Text Preprocessing
Text is everywhere, and knowing how to clean it will transform your data science skillset. Many in the industry estimate that 80% of data science is data cleaning, including text preprocessing. Transforming text into usable data requires specialized tools and techniques. This course introduces text cleaning with Python 3 using regular expressions (regex) and NLTK.
1. Get a taste of regular expressions (regex), a powerful search pattern language to quickly find the text you’re looking for.
2. Before most natural language processing tasks, it’s necessary to clean up the text data using text preprocessing techniques.
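
To give a feel for what these two topics look like in practice, here is a minimal sketch of a cleaning step written with Python 3's built-in `re` module. It is an illustration, not course material: the `preprocess` function, the sample sentence, and the particular cleaning rules (lowercasing, dropping URLs, replacing punctuation) are assumptions chosen for this example. The course also uses NLTK, which this sketch leaves out so that it runs with the standard library alone.

```python
import re


def preprocess(text):
    """Lowercase the text, drop URLs and punctuation, and split into tokens.

    Hypothetical helper for illustration; real pipelines vary by task.
    """
    text = text.lower()
    # Remove anything that looks like an http(s) URL.
    text = re.sub(r"https?://\S+", " ", text)
    # Replace every character that is not a letter, digit, or whitespace.
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    # Splitting on whitespace also collapses repeated spaces.
    return text.split()


sample = "Text is EVERYWHERE! Visit https://example.com, then clean it."
print(preprocess(sample))
# ['text', 'is', 'everywhere', 'visit', 'then', 'clean', 'it']

# Regex can also be used to find text rather than remove it:
print(re.findall(r"\d{4}", "Data from 2019 and 2021"))
# ['2019', '2021']
```

Which rules belong in a cleaning step depends on the task. Stripping punctuation, for example, may help a word-count model but discard signal that a sentiment model would want.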
How you'll master it
Stress-test your knowledge with quizzes that help commit syntax to memory

“I know from first-hand experience that you can go in knowing zero, nothing, and just get a grasp on everything as you go and start building right away.”
- Madelyn, Pinterest