10 Facts About Data Science You Should Know

This presentation deals with 10 amazing facts that one must know about data science. For more - https://www.henryharvin.com/business-analytics-course-with-python

TanyaAg

Presentation Transcript


  1. BUSINESS ANALYTICS WITH PYTHON 10 Facts About Data Science You Should Know HENRY HARVIN EDUCATION

  2. Data science is the talk of the town today. It came into the picture around 2008, when advances in the internet and device connectivity produced an immense flow of data, and with it a need for professionals who could handle and analyse huge volumes of data of all kinds. Since then, data science has gained great momentum, along with certain myths and facts associated with it. We shall discuss some of those facts in this blog. The volume of data worldwide is growing at an astronomical rate: the data produced in the last two years alone is greater in volume than all of the data produced before that. This shows how important data science has become for handling this eruption of data across platforms.

  3. WHAT IS DATA SCIENCE? Data science is the scientific approach to handling data and making the most of it to drive a business. The real challenge lies in getting from the first to the second: turning messy, enormous volumes of data into something of value.

  4. DATA: THE GAME CHANGER Data science was born out of data. The volume of data of every type today should not be underestimated. Data is truly the game changer that produced this entirely new field. It is therefore worthwhile for anyone interested in data science to know some facts about data itself before moving on to the facts about data science. With that in mind, let us check out a couple of knowledge bytes about data.

  5. If someone asks the fundamental question, “How much data exists in the world today?”, there is probably no definite answer. But it is estimated that by the year 2025, 463 exabytes of data will be created daily.
  • We create new data every millisecond. We make about 40,000 search queries on Google each second, which amounts to more than 1.2 trillion searches per year.
  • Data coming into any field or business is no longer just data; it is Big Data.
  • Since data science revolves around data and is gaining remarkable weight, it has many interesting facts of its own that deserve attention. If you are a data science enthusiast, this article is sure to draw you further towards it.
  What is the Biggest Data Unit You Know? Since childhood we have read about the units of digital data, from the bit and the byte up to the gigabyte and the terabyte. It is exciting to learn the units used to measure all the big data that is floating around and driving the engines of businesses big and small.
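As a rough check on the figures quoted in this slide, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the 365-day year, the use of decimal (SI) units where each unit is 1000 times the previous one, and the names `UNITS` and `bytes_in` are assumptions made here, not part of the original presentation.

```python
# Rough arithmetic behind the figures quoted in the slide.
SEARCHES_PER_SECOND = 40_000             # figure quoted in the slide
SECONDS_PER_YEAR = 60 * 60 * 24 * 365    # assumption: 365-day year, no leap day

searches_per_year = SEARCHES_PER_SECOND * SECONDS_PER_YEAR

# Decimal (SI) data units, each 1000 times the previous one.
UNITS = [
    "byte", "kilobyte", "megabyte", "gigabyte", "terabyte",
    "petabyte", "exabyte", "zettabyte", "yottabyte",
]


def bytes_in(unit: str) -> int:
    """Number of bytes in one of the given unit (hypothetical helper)."""
    return 1000 ** UNITS.index(unit)


# 463 exabytes per day (estimate quoted in the slide), expressed in terabytes.
daily_exabytes = 463
daily_terabytes = daily_exabytes * bytes_in("exabyte") // bytes_in("terabyte")

print(f"{searches_per_year:,} searches per year")
print(f"{daily_terabytes:,} terabytes per day")
```

Under these assumptions the yearly total comes to about 1.26 trillion searches, which is consistent with the rounded figure in the slide, and 463 exabytes corresponds to 463 million terabytes.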

  6. TOP INTRIGUING FACTS ABOUT DATA SCIENCE

  7. 10. Data Scientists and Data Analysts are NOT the Same
  That the two roles are the same is a common myth among people with only a superficial idea of data science. In reality, their work is quite different: data analysts find trends and analyse the data, whereas data scientists look for the cause of a trend and forecast upcoming trends. As data science is a new field, some misconceptions are inevitable. It is worth noting, however, that the two work in tandem: they complement each other and work towards a common goal. Now let us check out some of the basic differences between the two.
  DATA SCIENTIST
  • Discovers unexplored questions that may need an answer
  • Skillset: algorithms, data mining, programming, database management, data analysis, machine learning, predictive analysis
  • Estimates unknown data
  • Chooses to address the business problems that would have the maximum effect
  • Works at the macro level
  DATA ANALYST
  • Uses existing information to get workable data on existing questions
  • Skillset: data mining, modeling, programming, statistical analysis, database management, data analysis
  • Works with a known data set
  • Addresses the business problem assigned to them
  • Works at the micro level

  8. 9. Data is Never Clean
  ABOUT DIRTY DATA: dirty data takes one or more of the following forms:
  • Incomplete
  • Duplicate
  • Irrelevant
  • Inaccurate
  • Incorrect
  • Misspelled
  It's true: data is messy. Even when data is collected and cleaned with an eagle eye, some discrepancy or other creeps in at some point. Data scientists know how to work with chaotic, noisy data and clean it along the way. Dirty collected data is one problem, but the bigger problem is joining multiple datasets into a single entity. The data may have been collected from different sources by different people, software, devices and so on, so there is a strong possibility that the datasets are not consistent with each other. The join key may not be consistent, or the format may differ between systems. Data scientists clean the entire data by re-formatting, screening, organising and so on. The question now is: if data is so unclean, how is any analysis done on it? The answer is that, in the end, the data is clean enough to reach the desired outcome. Several data cleaning techniques are applied at every step to arrive at the least dirty form of the data, and this becomes the basis of the final analysis.
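The cleaning and joining problem described above can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the `crm` and `orders` records, the `email` join key, and the helpers `normalize_key` and `clean` are invented for illustration, and real projects would typically use a dedicated data library and far more thorough rules.

```python
def normalize_key(raw: str) -> str:
    """Bring join keys from different systems into one format."""
    return raw.strip().lower()


def clean(records, key):
    """Drop incomplete and duplicate rows; index the rest by normalized key."""
    seen = {}
    for row in records:
        # Incomplete: the join key or any other field is missing or empty.
        if not row.get(key) or any(v is None or v == "" for v in row.values()):
            continue
        k = normalize_key(row[key])
        # Duplicate: the same key appeared earlier (after normalization).
        if k in seen:
            continue
        seen[k] = {**row, key: k}
    return seen


# Two hypothetical sources that format the same join key differently.
crm = [
    {"email": " Ana@Example.com ", "city": "Pune"},
    {"email": "ana@example.com", "city": "Pune"},   # duplicate once normalized
    {"email": "raj@example.com", "city": ""},       # incomplete
    {"email": "li@example.com", "city": "Delhi"},
]
orders = [
    {"email": "ANA@example.com", "total": 120},
    {"email": "li@example.com", "total": 80},
    {"email": "sam@example.com", "total": 45},      # no matching customer
]

customers = clean(crm, "email")
sales = clean(orders, "email")

# Inner join: keep only keys present in both cleaned datasets.
joined = [{**customers[k], **sales[k]} for k in customers if k in sales]

for row in joined:
    print(row)
```

Without the normalization step, " Ana@Example.com " and "ANA@example.com" would not match, and the join would silently lose that customer.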

  9. 8. YOU DO NOT NEED TO BE TECH-SAVVY OR HOLD A PHD TO LEARN DATA SCIENCE

  10. Data science sounds like a field for tech-savvy professionals, and this leads to the common misbelief that to learn data science one needs a super brain or a PhD. This is absolutely incorrect. As a matter of fact, anyone of average intelligence can learn data science.
  Learning data science involves upskilling in the following fields:
  • Statistical modelling
  • Predictive modelling
  • Machine learning
  • Programming
  • Algorithms
  • Analytics
  In a nutshell, data science is not as heavy as it may look. An openness to possibilities is the only prerequisite; the rest falls into place as you learn.

  11. 7. Data Science is Not Just Excel Sheets In contrast to the previous myth, and perhaps surprisingly, many people are of the opinion that the life of a data scientist revolves around Excel sheets. This is anything but true. As mentioned before, data science is a vast field whose basic focus is on reaching the correct, intended outcome, and data science professionals fight tooth and nail to get it. They use different data analytics techniques, SQL queries, statistical analysis, predictive analysis and more. There was a time when Excel sheets played a major role in analysis, with conclusions reached through formulae and calculations. Today, with programming tools like Python and R readily available, most data scientists spend a great portion of their time coding rather than working in Excel sheets.
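To give a flavour of what coding instead of spreadsheet formulae can look like, here is a small sketch that combines a SQL query with basic statistical analysis using only Python's standard library. The `sales` table and its values are made up for illustration and do not come from the presentation.

```python
import sqlite3
import statistics

# In-memory database with a small, made-up sales table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [
        ("north", 120.0),
        ("north", 80.0),
        ("south", 200.0),
        ("south", 100.0),
        ("south", 150.0),
    ],
)

# SQL query: total sales per region (what a pivot table might do in a sheet).
totals = dict(
    conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    )
)

# Statistical analysis on the raw amounts.
amounts = [row[0] for row in conn.execute("SELECT amount FROM sales")]
mean_amount = statistics.mean(amounts)
median_amount = statistics.median(amounts)

conn.close()

print(totals)
print(mean_amount, median_amount)
```

The same steps scale to much larger tables and can be rerun automatically, which is one reason code tends to replace manual spreadsheet work.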

  12. 6. Data Science Competitions and Real Life Projects are Different Success in a data science competition (e.g. on an online platform like Kaggle) may boost one's confidence so much that one starts thinking of a career in data science. But it is important to understand that there is quite a lot of difference between a competition and a real-life scenario.

  13. DATA SCIENCE COMPETITIONS
  • The number of datasets is limited.
  • On online competition platforms, a warning is given when you have made an error.
  • You need to write the code just once.
  • You do not need to deploy your model.
  • There is no authentication or security.
  REAL-LIFE PROJECTS
  • There is no limit on data and datasets. It's the data that matters.
  • There is no warning. You learn only after you have made a mistake and borne the consequences; then you go back, clean the data and redo the work.
  • You need to rewrite the code every 5-15 minutes.
  • You definitely deploy your model.
  • Authentication and security are as important as the data itself.
  So it is safe to say that competitions give fair practice for data science, but they are not enough. You need to get your hands dirty and work on live, real-world projects to grasp the true essence of data science.

  14. 5. More Data Does Not Always Mean More Accuracy Let us understand this point through a bottom-up approach. Suppose we have a dataset with exactly the minimum amount of data needed to make a correct analysis. This would be an ideal dataset. If we now add some more data, the entire dataset has to be reconstructed to take the new data into account. While reconstructing, we need to clean the new data and spend time understanding how it deviates from the existing set, if at all. Even after the new data has been cleaned and merged into the existing ideal dataset, some new element may still be dirty but unidentified. This degrades the final result or analysis. In this case, less data was surely better than more data. Hence, more data doesn't mean more insight or more value addition. Using smart data is the key.
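The argument above can be made concrete with a toy example. The numbers are invented: a small "ideal" dataset of ages, a hypothetical true average of 30, and a new batch in which one typo (300 instead of 30) slips through cleaning unnoticed.

```python
import statistics

TRUE_AVERAGE_AGE = 30  # hypothetical ground truth for this toy example

# The "ideal" dataset: small, but clean.
ideal = [28, 29, 30, 31, 32]

# A new batch of data; 300 is a typo for 30 that cleaning failed to catch.
new_batch = [30, 31, 29, 300]

# Merging gives us more data, but one dirty value comes along with it.
merged = ideal + new_batch

error_ideal = abs(statistics.mean(ideal) - TRUE_AVERAGE_AGE)
error_merged = abs(statistics.mean(merged) - TRUE_AVERAGE_AGE)

print(f"rows: {len(ideal)} -> error {error_ideal}")
print(f"rows: {len(merged)} -> error {error_merged}")
```

In this sketch the larger dataset gives the worse estimate, because a single unidentified dirty value outweighs the benefit of the extra clean rows.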

  15. 4. Data Science Field has Different Roles, Not just Data Scientists
  Data science includes all of these:
  • Data engineer: responsible for managing the data infrastructure throughout the data science life cycle. Basic skills include programming tools like Python, database tools like NoSQL and big data tools like Hadoop.
  • Data analyst: finds answers to questions by working through the available data, using appropriate tools. Basic skills include programming, data visualisation, statistics, mathematics and, of course, data analysis.
  • Data scientist: works on big data, analyses it and then communicates the findings through reports and presentations. Basic skills include statistics, mathematics, programming, data visualisation, SQL, Hadoop and machine learning.
  Apart from these, you can also build a career in data science through various other roles.

  16. 3. DATA SCIENCE IS NOT MEANT ONLY FOR LARGE ORGANIZATIONS Many businesses believe that data science is meant only for big organizations with high-class infrastructure. This belief springs from a wrong notion of data science. Data science is not made up of machines, heavy tools or large working resources. Rather, it is made up of big data, statistics, analysis, programming, presentation and some smart people who know how to make the best of data and add value to the organization. It has nothing to do with whether the organization is big or small. A data scientist needs to arrive at a result that benefits the company, and no one really cares which tools and techniques were used to achieve that result. As for infrastructure, all that is needed is a computing device, an internet connection and some tools that help through the data science life cycle. There are a number of open source tools available online that can be downloaded to get the ball rolling.

  17. 2. DATA SCIENCE NEEDS GREAT COMMUNICATION SKILLS
  Communication and presentation play a key role in data science. Communication here refers to two areas:
  • Coordinating within and among teams during the different stages of the data science life cycle.
  • Presenting the final outcome in the most comprehensive and lucid manner.
  Without proper communication, the entire exercise may prove futile and may not translate into any substantial product. It is important to learn public speaking, as a lot of presentations are involved.
  WRITING INVOLVES
  • PowerPoint
  • Blog
  • Email
  • Report
  An analysis without proper communication, in writing or otherwise, is just a placeholder with no significance.

  18. 1. Data Science is Not for Everyone There are lots of videos and articles on the web suggesting that anyone can be a data scientist. That is true only under certain conditions. It is always a good idea to first ask yourself why you want to be in this field, and to do a reality check before taking a blind leap. Introspection at the start goes a long way towards a successful stint in any field.

  19. CONCLUSION Data science is becoming indispensable with the explosion of data in almost every field, and it offers a good career opportunity. Choosing data science as a career can be a wise decision for anyone who enjoys problem solving and has empathy for data. As cool as it sounds, it also has immense potential for businesses and job seekers alike. But it is advisable not to fall for wrong information about the field. With its growing popularity, data science has acquired some myths, which we looked at along with some interesting facts. Let me know in the comments if I missed any point.

  20. GET IN TOUCH
  Website: https://www.henryharvin.com
  Phone: +91 - 9015266266
  Mail: info@henryharvin.com
