Why Data Science?

Virtually every industry in today’s evolving world uses data science to systematically process big data and uncover trends and patterns that inform investment, operations, targeted marketing, advertising, customer behavior analysis, forecasting, fraud prevention, employee performance evaluation, and much more, all of which can provide a competitive advantage. With a boundless supply of data available thanks to modern technology and the internet, businesses are turning to data mining to extract pertinent information for cleansing and analysis. The terms data science, data mining, and analytics are often used interchangeably today, but there are subtle differences among the three. Data mining is the operational extraction of data that is germane to solving a problem, and analytics is the examination of that data to yield actionable insights about the foreseeable future. Data science, finally, is the collection of fundamental principles that supports the extraction, identification, and examination of the data. In other words, data science is the broad field that branches out into data mining and analytics, which are used to gather and filter critical information and help us learn what we don’t know.

Data science is growing rapidly across organizations of every kind. Big or small, public or private, government agencies or entrepreneurs, all of them are focusing more than ever on exploiting useful data to solve problems and make sound strategic decisions. The past 15 years have seen unprecedented growth and opportunity in the field, and data science has become more powerful than ever.

The study of data science helps people answer questions such as: When should I release this new movie? Where on this street should I open a new Dunkin’ Donuts store? In my grocery shop, should I move the baked goods aisle next to the fresh fruit section? Let’s take a look at some everyday scenarios as we learn more about analyzing data.

Movie distributors such as Columbia Pictures and 20th Century Fox use data science to help them schedule their films’ releases. The data they collect helps them identify which dates, given seasons, holidays, and competitors’ release dates, will maximize their profit while minimizing the money lost to competing movies during the opening weeks. Similarly, chain restaurants such as Dunkin’ Donuts, Taco Bell, and Waffle House use data science to gauge which locations see a lot of traffic and have few restaurants serving similar food, which in turn can maximize both their visibility and, of course, their profit.

In marketing, companies use data science to understand consumer behavior in order to retain customers and prevent them from churning. Churning, in business terminology, means customers switching to other companies. Apple, Delta Airlines, AT&T, AARP, Six Flags, Target, and Walmart concentrate on data science because it gives them a good look at their marketing performance. Let’s start with Apple and Delta Airlines. These two analytically savvy companies use data science to price their products shrewdly in an effort to capture as many sales as possible. When the iPhone 6 Plus was released in 2014, Apple charged $299 for the 16GB model and $399 for the 64GB model (on a two-year contract). This tactic made the 64GB model look like a far better deal than the 16GB model, since it offered four times the storage for only $100 more. To make matters more interesting, Apple charged $499 for the 128GB model. Delta Airlines uses the same ruse with its seats and their respective prices, which helps explain why seats in economy class are so bad but so cheap! Thanks to data science, companies often deliberately make certain products both very inferior and very inexpensive so that they can encourage customers to spend more than necessary.
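To make the pricing arithmetic concrete, here is a minimal sketch that works out the price per gigabyte for the three iPhone 6 Plus tiers quoted above, along with the cost of each extra gigabyte when stepping up a tier. The function names are my own and purely illustrative, and this is only a back-of-the-envelope calculation, not a description of how Apple actually sets its prices.

```python
# Storage tiers (GB) and prices (USD) as quoted in the paragraph above.
IPHONE_6_PLUS_PRICES = {16: 299, 64: 399, 128: 499}


def price_per_gb(prices):
    """Return the average price of one gigabyte for each tier."""
    return {gb: round(usd / gb, 2) for gb, usd in prices.items()}


def upgrade_cost_per_gb(prices):
    """Return the cost of each extra gigabyte when moving up one tier."""
    tiers = sorted(prices.items())
    steps = {}
    for (low_gb, low_usd), (high_gb, high_usd) in zip(tiers, tiers[1:]):
        extra_gb = high_gb - low_gb
        extra_usd = high_usd - low_usd
        steps[(low_gb, high_gb)] = round(extra_usd / extra_gb, 2)
    return steps


if __name__ == "__main__":
    for gb, usd in price_per_gb(IPHONE_6_PLUS_PRICES).items():
        print(f"{gb}GB model: ${usd:.2f} per GB")
    for (low, high), usd in upgrade_cost_per_gb(IPHONE_6_PLUS_PRICES).items():
        print(f"{low}GB -> {high}GB: ${usd:.2f} per extra GB")
```

Seen this way, the cheapest model is by far the most expensive per gigabyte, which is exactly what makes the middle tier feel like a bargain.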

Next, let’s look at publishing. AARP, which publishes a lifestyle magazine, uses data science to determine which kinds of people, by age group, race, gender, occupation, place of residence, and income, are most likely to subscribe and which are least likely to subscribe, or are likely to unsubscribe. They then use that information to identify the readers who are unlikely to buy the magazine and mail those people special discount coupons. Similarly, Six Flags offers special discounts to local residents who can provide proper documents and zip codes, because people who live nearby have less incentive to visit its amusement parks than tourists and other non-locals do. Six Flags also uses data science to determine when to build a new attraction to pique the crowds’ interest, thereby maintaining its popularity and increasing its profit.

As for Walmart and Target, both have huge databases that help them assess customer behavior. Target’s database stores customers’ purchasing behavior so the company can infer their socioeconomic status and income from what they buy. Interestingly, Target uses that same tool to determine who is expecting a baby soon and then aggressively sends them offers, because once it wins over a pregnant customer, that person is likely to buy all of their baby products (baby food, toys, clothes, infant care kits, diapers, wipes) from Target. Walmart, the retail giant, uses data science to project sales based on seasons, holidays, and impending natural disasters such as hurricanes and snowstorms, so it can anticipate what kinds of goods to stock in advance. Surprisingly, according to Walmart and the New York Times, besides the obvious items such as bottled water, flashlights, and non-perishable goods that are bought before an incoming hurricane, products like beer and condoms also see a substantial spike during those times.

Furthermore, credit card and insurance companies rely heavily on data science (or more specifically, analytics) to decide whether a potential customer is eligible for their services based on age, income, gender, assets, debts, properties owned, marital status, past history, and many other factors.

Now let’s take this subject to hospitals. Hospitals keep a map that details the demographics of their vicinity. The map shows the population’s general age, ethnicity, religion, socioeconomic status, and more, so that hospitals can anticipate which districts are more likely to see accidents. For example, hospitals strategically station more ambulances in areas with more Latino residents around Cinco de Mayo. The same approach is used for Chinese New Year and Yom Kippur. This extends beyond the population alone to the weather (such as extreme cold) and to events (for example, the Super Bowl and the Fourth of July). This helps explain why, when you dial 911, an ambulance can arrive in less than four minutes. According to an EMT, this method of predicting the probability of accidents and deaths is actually 94% accurate! The government applies the same tactic to fighting crime.

We have now seen a few examples of how greatly data science can impact the world, not just in business but in nearly every matter and issue that surrounds us. Finally, let’s apply this topic to sports. Besides the obvious areas such as ticket and apparel sales, a team’s marketing department also carefully analyzes fan behavior, for example by deciding whether giving certain products away to all fans as part of a game promotion will discourage season ticket holders from committing to the team. The marketing department also helps decide whether the franchise’s name and logo resonate with the fans, because iconic branding can directly affect attendance. This important relationship with the fans is the reason the Washington Bullets changed their name to the Wizards. It’s also why Michael Jordan changed the Charlotte Bobcats’ name to the Hornets: he figured the Hornets was a better name for capturing the fans’ hearts.

Digging to a deeper level, data science can be applied to the operations side of sports, which is crucial to a team’s success. With data science, and by extension analytics, a team can decide what kind of roster it should construct and which types of players it should keep or let go. Analytics helps coaches figure out the optimal lineup for generating the best offense and defense. It is also used to evaluate opponents by determining which strategy is most effective against a specific opponent based on its tendencies. For example, on December 1, 2015, the Washington Wizards upset the Cleveland Cavaliers in a surprising 95-87 victory, handing them their first home loss of the season. Then-head coach Randy Wittman used an unheralded “ultra-small” lineup that featured Jared Dudley at center. Dudley, along with John Wall, Bradley Beal, Otto Porter, and Garrett Temple, beat the Cavaliers and their “Big 3” of LeBron James, Kyrie Irving, and Kevin Love in shocking fashion. The “ultra-small” lineup forced the Cavaliers to keep up with the Wizards’ up-tempo offense, as the Wizards spread the floor very well and exploited the Cavaliers’ defensive inefficiency by driving aggressively and actively looking for open men under the basket or at the 3-point line. Defensively, despite being undersized, this lineup succeeded in stifling the Cavaliers’ offense through quickness and constant switching in its defensive rotations.

And the fun doesn’t stop here. Analytics can help predict which college prospects are NBA-ready and whose games will translate to the next level; it is also used to come up with a Plan B and a Plan C on draft day based on the probability that a targeted player will be selected before the team’s turn. On the “internal” business side, analytics is used to determine how much money a player should receive, since a sports team never has enough money to pay every player what he wants, from the superstar to the end of the bench (and the same applies to the real world, too). General managers use analytics to evaluate player performance and use that as leverage in deciding the length and amount of a contract, based on injury history, potential growth, upcoming free agents, and the team’s current financial situation. Analytics also has a significant effect on deciding which players should be released in favor of better candidates, because a basketball team has only 15 roster spots and that’s it! Of course, analytics alone does not decide everything, but it is very powerful and certainly very influential.

With that in mind, this is why I started this website. While I won’t go into the details of data science, I will try my best to provide a more analytical view of the game of basketball. I use as little math as possible because I like to keep things simple. Attributing his success to Coach Popovich, Mike Budenholzer said when accepting his Coach of the Year award in 2015 that from being around Coach Popovich, he learned that “sometimes the most successful things are very, very simple.” While that does not really apply to analytics, and is in fact the antithesis of it, I promise to keep things simple by presenting basic principles derived from my statistical findings here on this website, as I want all readers to enjoy my content as much as possible. As a student of the game, I love to share my knowledge, and I believe the world will be a better place with free knowledge. The first step toward that is starting this website.

 

–J.H. Yeh
