L12.1: Text Generation
In this video I start our unit on generative AI using deep learning networks. I look at generating sequences, using text generation as the example.

You can generate discrete sequence data by training a model to predict the next token given the previous tokens. We have seen this kind of next-token training before, in the decoder for sequence-to-sequence translation. A model trained to predict the next token of text is called a language model. A language model captures the latent space of a language: its statistical structure.

To generate text, feed a trained language model the input tokens 0..N and ask it to predict token N+1. Then append that token to the end of the sequence and repeat. For good generation you also need to consider your sampling strategy. Always picking the single most likely next token tends to produce repetitive, predictable text; sampling randomly from the predicted token probability distribution usually gives more natural-sounding output sequences.

Resources:
Textbook: Chollet (2022). "Deep Learning with Python (2ed)". Manning. https://www.amazon.com/dp/1617296864/?bestFormat=true&k=deep%20learning%20with%20python&ref_=nb_sb_ss_w_scx-ent-pd-bk-d_de_k0_1_15
CSci 560 Class Repository: https://github.com/csci560-nndl/nndl
Contains video slides and IPython notebooks for this course.

00:00 Introduction
02:52 How do you generate sequence data?
07:36 The importance of the sampling strategy
12:55 Implementing text generation on movie review dataset
25:24 Summary
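
The generation loop described above (predict the next token, sample from the distribution, append, repeat) can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the code from the video or the course notebooks: `toy_model` is a hypothetical stand-in for a trained language model, and the `temperature` parameter is one common way to control how random the sampling is, assumed here for illustration. All function names are made up for this example.

```python
import math
import random


def reweight(probs, temperature=1.0):
    """Rescale a probability distribution with a softmax temperature.

    Low temperature sharpens the distribution (closer to greedy picking);
    high temperature flattens it (more surprising choices).
    """
    logits = [math.log(max(p, 1e-12)) / temperature for p in probs]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - peak) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_next(probs, temperature=1.0, rng=random):
    """Draw one token index at random from the reweighted distribution."""
    weights = reweight(probs, temperature)
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


def generate(predict_next, prompt, num_tokens, temperature=1.0, rng=random):
    """Repeatedly predict, sample, and append the next token."""
    tokens = list(prompt)
    for _ in range(num_tokens):
        probs = predict_next(tokens)  # distribution over the vocabulary
        tokens.append(sample_next(probs, temperature, rng))
    return tokens


def toy_model(tokens):
    """Hypothetical 'language model' over a 4-token vocabulary.

    It favors the token that follows the last one in cyclic order.
    A real model would be a trained network returning softmax outputs.
    """
    vocab_size = 4
    favored = (tokens[-1] + 1) % vocab_size
    return [0.7 if i == favored else 0.1 for i in range(vocab_size)]


if __name__ == "__main__":
    rng = random.Random(0)
    for temperature in (0.2, 1.0, 2.0):
        print(temperature, generate(toy_model, [0], 12, temperature, rng))
```

With a low temperature the toy output mostly follows the favored cycle, while higher temperatures produce more varied sequences, which mirrors the trade-off between predictable and surprising text.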