Around the New Year, I put together a survey inspired by something the soccer analytics community did last year. Thom Lawrence ran that one, and it was very informative and cool to see. He’s also a good follow on Twitter. This post will go over the results from nearly 500 responses.
Let’s jump right into it. The first question of significance is which side of the debate do you fall on? It was good to see that nearly half of the people filling out the survey didn’t identify as stats-first people. I thought this entire chart might be 80 – 90% stats affiliation.
If you check out the soccer survey linked above, you’ll find the percentages are eerily similar. About 5% more people in this survey identified as System Writers than chose the Tactics category in the soccer survey, but that was the largest difference. I wonder how many people would say they are both system and stats folks, as that’s the best of both worlds. I do think there is a healthy amount of skepticism on each side, for good reasons: it’s easy to arrange video clips to show what you want to show, and it’s also easy to assemble a few data points (especially without testing them) to support a narrative.
That being said, there are plenty of good people out there that combine both to explain, instruct, and analyze this silly game we all spend so much time on.
So, let’s see how old, er, young everyone is.
Naturally, there’s a lot of youth in the community, but it’s not like it’s as drastic as, say, the falloff at the end of Wade Redden’s career. You can see the percentage of total respondents that fell into each bucket as well. Let’s see how those employed by pro and amateur teams break down.
No one under the age of twenty currently works, or has worked, with a pro team, either in Europe or in the NHL. That’s not surprising. From there we see a number of people across all age groups who are currently working with teams, as well as those who have worked with them in the past. Only about 5% of all respondents are either currently working with an NHL or European team or have in the past. From this, I count sixteen people currently working with NHL/European teams. Can you name them all?
Many people will get the proverbial “foot in the door” working with an NCAA or CHL team. Also, these teams stand to benefit from getting even basic additional data on their own players and on the opposition. I’d wager many of those currently working are students or recent grads. About 16% of all respondents either currently work with a team at some level or have in the past.
And let’s not forget there is obviously a difference between being offered a small consulting salary and being offered a full-time salary. I should have also asked how many people have turned down a position and what the reason was. I’ve heard of more than a few people turning down positions because of pay or relocation.
Let’s see how smart we all are.
A great many of us came to hockey analytics not having a traditional math education beyond high school. I know I didn’t get much practice at regression modelling while studying Antebellum America and other historical periods of revolution and civil unrest. I might have been sick the day my professor addressed how repeatable the Fugitive Slave Act was (spoiler alert: it wasn’t).
However, there are many classically trained mathematicians and statisticians that educate the rest of us from time to time. We also live in an age where furthering your knowledge about basic research and statistical principles is at your fingertips.
What’s also interesting is that many of the people working for teams identified as self-taught as well.
Tools & Technologies
Let’s see what tools we use to show how smart we are.
We are the masters of the spreadsheet. R looks to be twice as popular as Python. I was surprised to see so few Tableau users. I think that’s a fantastic way to make your data far more accessible if you’re already working in Excel files. I like using it.
For those unfamiliar with programming languages or unsure where to pick them up, here’s where the respondents learned these skills.
There were some others like the popular Analyzing Baseball Data with R, YouTube, or those currently enrolled in college courses. I’m currently working through DataCamp’s courses in between changing diapers. Looks like Manny is doing quite all right with his R courses.*
Where Do We Get Our Data?
Lots of people use sites, track data for projects, or build their own scrapers and databases from which they work.
I think the fact that data sites allow you to download the data into CSV files to save and use for various articles or research is the greatest thing about them. We really should thank the people that create these sites and allow such easy access for FREE more often.
Some of these categories overlap, but for the twenty-eight respondents that aren’t sure where to find data, you can visit any of these sites: Corsica, NaturalStatTrick, HockeyStats, DataRink, and HockeyAnalysis.
Six of the twelve respondents who say they pay for data from a private company currently work for an NHL team. I’m curious who the other six are that pay for it. Media members, perhaps?
Lots of people scrape data and I imagine that number will continue to increase. Also, lots of people manually track data (though I imagine most of those people have worked with me over the past couple of seasons). It’s good to see people continue to do that.
What do we want? Feedback!
When do we want it? Now!
How would we like it to be delivered? Uh….
That’s always been an issue when sharing your work online. There are some that will share it far and wide without reading it because they like you and your previous work. There are those that will read it and not share it. There are those that will pan it or be condescending because of one small aspect they disagree with. There are those that will invent a straw man argument and attack your work on a point you weren’t even trying to make/defend. I’ve seen it all.
When I first started posting and sharing work, it wasn’t always pretty. It wasn’t always good. However, when people commented or took notice, it was an awesome feeling. There are many smart people who share this space and I learned more than I ever thought I would about proper ways to analyze and present data thanks to many of the people in this community.
However, I think we can do a better job of interrogating each other’s work – why is this important? What am I supposed to learn from this? It’s perfectly acceptable to do things for fun or simply to prove to yourself you can put something together. We don’t have to be serious all the time, but I do think we need to delineate between the two a bit better.
There also comes a time when you have to ask if you’re going to take this seriously or not and have your work reflect that. Continuing to simply eyeball data, ignore evidence, and rely on your own opinions does a disservice to the community.
Most everyone wants feedback on their work. Why would we share it online if we didn’t? The trouble is often how it comes across on Twitter. I also think we’d all get along a bit better if we had a bit more respect for each other’s intelligence. Consider who it is that you’re addressing before you type something.
Which leads me to the next topics…
Things to Improve Upon/Work on for 2017
Some ideas on what the community would like to see going forward:
- More Special Teams work
- Better prospect evaluation
- More work done on microstats
- Better modeling of Quality of Competition/Teammates metrics
- More focus on WAR
- Tactical Analysis
There are many others and I encourage you to read through them all yourself, but these were some of the common ones. I do think we’re improving on a lot of those things, but I realize it may take longer for certain metrics/methods of analysis to make their way throughout the community.
Now we come to what people think we need to improve upon going forward:
- Being “more chill” (probably true)
- Make hockey analytics more accessible
- Analytical rigor
- Stop talking about Kris Russell
There were a great many comments on being nicer, communicating more clearly, not being defensive, etc. To those I will say that, yes, we can certainly improve upon those aspects of our daily interactions with each other; however, I will also say that it is endlessly amusing that the onus is on us to rise above the ignorance and condemnation of the “old-school” crowd, as if they are beyond hope.
Holding Us Back
Speaking of “old-school” hockey guys, let’s take a look at what the community feels is holding us back.
The biggest impediments to hockey analytics, in the opinion of the respondents, are traditional and outdated perspectives in the media, followed closely by Old School Hockey Guys. This shouldn’t come as a surprise, as it’s normally media members antagonizing the analytics community through insults, name-calling, and an aversion to progress and science.
Echoing some of the other responses from above, tracking data is what most of us are waiting for.
So, this is where we are in early 2017. I do agree that communication and the overall environment on Twitter are things that can be improved, but I also think we need to pick our battles and hold people accountable when they are discrediting progress. Striking a balance between those won’t always be pretty, but it’s something to strive for.
As far as actual work – I think we’ve established many good things given what’s available. I do think we can apply data to more sophisticated tactical analysis. I also think evaluating coaches is something that can be investigated further.
If you want to see or make your own charts off of the respondent data, it’s here. It’s all anonymous, so no one will know how you answered. I did promise when I released the survey that the data would be available after the fact, so this shouldn’t surprise anyone. There are a lot of interesting responses to go through.
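If you'd rather start from a script than a spreadsheet, percentage breakdowns like the ones in this post could be tallied roughly as follows. This is only a minimal sketch: the column name `affiliation` and the sample rows are made up for illustration, and the real headers in the response file will almost certainly differ.

```python
import csv
import io
from collections import Counter

# Hypothetical stand-in for the downloaded response file.
# The column name "affiliation" and these rows are illustrative only.
SAMPLE_CSV = """affiliation
Stats
Systems
Stats
Both
"""


def share_by_answer(rows, column):
    """Return {answer: percent of non-blank responses} for one survey column."""
    answers = [(row.get(column) or "").strip() for row in rows]
    counts = Counter(a for a in answers if a)  # skip blank answers
    total = sum(counts.values())
    return {answer: 100 * n / total for answer, n in counts.most_common()}


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
shares = share_by_answer(rows, "affiliation")
for answer, pct in shares.items():
    print(f"{answer}: {pct:.1f}%")
```

To run it against the real data, swap the `io.StringIO(SAMPLE_CSV)` stand-in for an opened copy of the downloaded CSV and pass whichever column header you want to chart.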