Text Augmentation To Boost NLP Model Performance (Part 2/3)
Data augmentation techniques generate additional, synthetic data from the data you already have. Image augmentation has become a standard procedure in computer vision applications, but text augmentation is relatively new to the natural language processing (NLP) field. The nlpaug module implements a number of text augmentation algorithms that may boost the performance of NLP models.

In Part 1 of the tutorial (https://www.youtube.com/watch?v=lpWewl7y57o), we introduced some cool text augmentation functions in the nlpaug module. In Part 2, we use the nlpaug module to augment Twitter tweet data and evaluate the performance of bag-of-words models with and without text augmentation.

Topics include:
How to install and import the dependencies needed to run the nlpaug module in Google Colab?
How to download the models required by the nlpaug module?
How to build a bag-of-words model for tweet sentiment analysis?
How to manage tweet data using a pandas DataFrame?
How to tokenize words using NLTK?
What are stop words, and how can we remove them from the tweet data?
What are the differences between lemmatization and stemming?
How to lemmatize or stem the words / tokens in the tweet data?
How to create a function that systematically cleans tweet data?
How to create a vocabulary for the bag-of-words model using CountVectorizer in the sklearn (scikit-learn) module?
What is term frequency - inverse document frequency (TF-IDF)?
How to transform tweet data into a TF-IDF array using TfidfTransformer in the sklearn (scikit-learn) module?
How to build a logistic regression model to classify tweet sentiment?
How to use synonym-based text augmentation to augment tweet data?
How to use contextual word embedding-based text augmentation to augment tweet data?
How to use back translation-based text augmentation to augment tweet data?
How to compare bag-of-words model performance with and without text augmentation?
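As a rough illustration of the synonym-based augmentation idea listed in the topics, here is a minimal sketch that uses only the Python standard library. It is not the nlpaug implementation or the code from the video: the `SYNONYMS` table, the function names `synonym_augment` and `augment_dataset`, and the toy tweets are all hypothetical and exist only for this example. The nlpaug module draws its synonyms from a lexical database rather than a hand-written table.

```python
import random

# Hypothetical hand-written synonym table, for illustration only.
# A real augmenter would look synonyms up in a lexical database.
SYNONYMS = {
    "good": ["great", "fine"],
    "bad": ["poor", "awful"],
    "movie": ["film"],
    "happy": ["glad", "cheerful"],
}


def synonym_augment(text, rng, prob=0.5):
    """Return a copy of text with some known words swapped for a synonym."""
    out = []
    for token in text.split():
        candidates = SYNONYMS.get(token.lower())
        # Replace a token only if it has synonyms and the coin flip succeeds.
        if candidates and rng.random() < prob:
            out.append(rng.choice(candidates))
        else:
            out.append(token)
    return " ".join(out)


def augment_dataset(texts, labels, n_copies=1, seed=0):
    """Append n_copies augmented variants of each text, keeping its label."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    aug_texts, aug_labels = list(texts), list(labels)
    for text, label in zip(texts, labels):
        for _ in range(n_copies):
            aug_texts.append(synonym_augment(text, rng))
            aug_labels.append(label)
    return aug_texts, aug_labels


# Toy data (made up): 1 = positive sentiment, 0 = negative sentiment.
tweets = ["what a good movie", "bad service today"]
sentiments = [1, 0]

train_texts, train_labels = augment_dataset(tweets, sentiments, n_copies=2)
print(len(train_texts), "training examples after augmentation")
for text, label in zip(train_texts, train_labels):
    print(label, text)
```

The augmented copies keep the label of the tweet they were derived from, so the enlarged list can be fed to the same bag-of-words pipeline as the original data. Augmentation should be applied to the training split only, so that the evaluation data stays untouched.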
Code used in this video can be downloaded from GitHub: https://github.com/DreamJarsAI/Apply-AI-like-a-Pro/blob/main/8%20ML%20Text%20Classification%20with%20Text%20Augmentation.ipynb

Hashtags: #nlp #naturallanguageprocessing #text #augmentation #ai #nlptraining #nlppractitioner #artificialintelligence #machinelearning #deeplearning #python #pythonprogramming #pythontutorial #aitutorial #coding