LLMs and TV News Transcripts
We need three things out of the TV transcripts:
- Which “topic” was covered in each segment/channel-block?
- For each day, was there “new” information about each “topic”?
- For each day and topic, was the news favorable or unfavorable for Trump (or Clinton)?
Here I’ll detail how I’m planning to approach these.
1. Identifying Topics in Segments
We’ll build a list of “topics” manually. As a (super simple) example, these will be things like: storms, polls, Democratic convention, Republican convention, general election debates, Clinton scandals, Trump scandals. We’ll aim for 20 or so of these.
I’ll start by using OpenAI’s GPT-3.5 and GPT-4 APIs to see how well we can classify a few of these topics in a few hundred blocks for which we’ve already obtained topic weights via STM. I’ll match the STM topics to our manual “topics” and check statistics on the overlap (and compare the differences). I’ll also use this exercise to estimate how expensive it would be to classify all of our transcripts. I’ve been thinking about ways to minimize the number of requests we have to make; one would be to feed the API a transcript, then ask yes/no questions for all of the topics sequentially. We can randomize the order of the questions to reduce any bias induced by the ordering.
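The batched-questions idea can be sketched as follows. This is a sketch, not the final pipeline: the topic names are a placeholder subset of our manual list, and the prompt wording and the assumption that the model replies with one yes/no per line are illustrative choices, not settled decisions.

```python
import random

TOPICS = ["storms", "polls", "Democratic convention", "Trump scandals"]  # placeholder subset

def build_topic_prompt(transcript, topics, seed=None):
    """Build one prompt asking a yes/no question per topic, in randomized order.

    Shuffling the question order across requests reduces any bias induced
    by question position."""
    rng = random.Random(seed)
    shuffled = topics[:]
    rng.shuffle(shuffled)
    questions = "\n".join(
        f"{i + 1}. Does this transcript cover the topic '{t}'? Answer yes or no."
        for i, t in enumerate(shuffled)
    )
    prompt = (
        "Here is a TV news transcript:\n\n"
        f"{transcript}\n\n"
        "Answer each question with 'yes' or 'no', one answer per line:\n"
        f"{questions}"
    )
    return prompt, shuffled  # keep the shuffled order so answers map back to topics

def parse_yes_no_answers(response, shuffled_topics):
    """Map line-by-line yes/no answers back to topic names."""
    answers = [
        line.strip().lower().lstrip("0123456789. ")
        for line in response.splitlines()
        if line.strip()
    ]
    return {t: a.startswith("yes") for t, a in zip(shuffled_topics, answers)}
```

One request then covers all topics for a block, which is the main lever for keeping API costs down.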
If it turns out to be prohibitively expensive to use OpenAI for all of our data, I’ll apply the same structure with an open-source model. I’ll run a set of comparisons between fine-tuned and non-fine-tuned versions of the models, the OpenAI output, and the STM topics. If we go this route, we’ll also be able to extract probabilities for the binary “is topic X present in this transcript?” questions.
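With an open-source model we can read the next-token logits directly, so the yes/no probability is just the two answer-token logits renormalized. A minimal sketch of that step, assuming we have already pulled the logits for the “yes” and “no” tokens from the model:

```python
import math

def yes_probability(logit_yes, logit_no):
    """Renormalize the 'yes'/'no' answer-token logits into P(yes).

    Restricting the softmax to the two answer tokens turns the model's
    binary answer into a probability we can threshold or calibrate."""
    m = max(logit_yes, logit_no)  # subtract the max for numerical stability
    e_yes = math.exp(logit_yes - m)
    e_no = math.exp(logit_no - m)
    return e_yes / (e_yes + e_no)
```

This is where the open-source route has an edge over the chat APIs: the probabilities come for free with every classification.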
2. Identifying Events/New News in Topics
Having identified the topics present in each block, we’ll analyze topic coverage by day to identify whether an event has occurred. To do so, we’ll look at all blocks in a given day that cover a particular topic and ask whether any new news on that topic appeared that day. I think two approaches are reasonable:
- Across days t and t-1, compare summaries of all blocks with topic coverage
- Across days t and t-1, compare all blocks with topic coverage
In each case, we would want to know whether the coverage (or coverage summary) of the topic indicates that there was some new information revealed about the topic. This could be expressed as probabilities over yes/no responses, scores from 0-100, or yes/no responses with confidence scores from 0-100. We could use either OpenAI’s models or local models for this.
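Whichever output format we pick, we’ll need to parse the model’s reply. Here is a sketch for the “yes/no with a 0–100 confidence score” variant; the reply format it expects (e.g. “Answer: yes, confidence: 85”) is an assumption, and prompting the model for a fixed format like this is what makes parsing reliable:

```python
import re

def parse_novelty_response(text):
    """Parse a 'yes/no plus 0-100 confidence' reply into a dict.

    Returns None if no yes/no answer is found; clamps confidence to 100."""
    answer_match = re.search(r"\b(yes|no)\b", text, re.IGNORECASE)
    conf_match = re.search(r"\b(\d{1,3})\b", text)
    if answer_match is None:
        return None
    return {
        "new_information": answer_match.group(1).lower() == "yes",
        "confidence": min(int(conf_match.group(1)), 100) if conf_match else None,
    }
```

The same parser would work for either the summary-comparison or the all-blocks approach, since only the prompt changes.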
My preferred approach is to use summaries, for a few reasons. One is that it will make sanity checks easier when we watch the segments. It also lowers the likelihood of hallucinations (though it definitely doesn’t eliminate them). For most LLMs, retrieval of specific information seems to degrade (and hallucinations become more frequent) once context lengths exceed roughly 16k tokens, and working with summaries keeps us well below that.
3. Identifying Event Favorability
For each event, we need to quantify whether the information revealed on the topic is favorable or unfavorable to Trump/Clinton. This can be done in one of a few ways.
- Based on summaries from Part 2 above
- For each day, based on all blocks with topic coverage (same sample as Part 2)
- For each day that “begins” an event, based on all blocks with topic coverage (subsample of Part 2)
The setup here would be very similar to that of Part 2; we’d just ask a different question. In each case, we want to know whether the coverage (or coverage summary) of the topic indicates that the event was favorable or unfavorable to Trump/Clinton. All of the output formats outlined in Part 2 still apply.
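To make the parallel with Part 2 concrete, here is one way the favorability question could look, with replies mapped onto a simple {-1, 0, +1} score. The prompt wording and the one-word reply format are illustrative assumptions:

```python
# Map one-word favorability replies onto a signed score.
FAVORABILITY = {"favorable": 1, "neutral": 0, "unfavorable": -1}

def favorability_prompt(summary, candidate):
    """Same structure as the Part 2 novelty question, different question text."""
    return (
        f"Here is a summary of one day's coverage of a topic:\n\n{summary}\n\n"
        f"Was this coverage favorable, neutral, or unfavorable for {candidate}? "
        "Answer with one word."
    )

def favorability_score(response):
    """Map a one-word reply to {-1, 0, +1}; None if the reply is unparseable."""
    return FAVORABILITY.get(response.strip().lower().rstrip("."))
```

Running this once per candidate per event would give us the signed daily measure we need.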