- pick HN items with more than 400 votes
- gather their titles in a list
- ask the AI to filter out the ones that are not tech-related (Bay Area topics, politics)
- scrape selected articles
- write summaries
- publish static website
Other sources are Reddit subreddits and RSS feeds (official language blogs and GitHub's releases page).
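A minimal sketch of the fetch-and-filter steps, assuming the public HN Algolia search API; the prompt wording and helper names are my own, not the author's actual code:

```python
import json
import urllib.request

# Hypothetical endpoint choice: Algolia's HN search API, filtered to
# stories with more than 400 points (">" URL-encoded as %3E).
HN_API = ("https://hn.algolia.com/api/v1/search_by_date"
          "?tags=story&numericFilters=points%3E400")

def fetch_candidates():
    """Pick high-vote HN items and gather their titles in a list."""
    with urllib.request.urlopen(HN_API) as resp:
        hits = json.load(resp)["hits"]
    return [(h["title"], h.get("url", "")) for h in hits]

def build_filter_prompt(titles):
    """Ask the model to drop non-tech items (Bay Area topics, politics)."""
    numbered = "\n".join(f"{i}. {t}" for i, t in enumerate(titles, 1))
    return ("From the numbered list below, return only the numbers of the "
            "items that are tech-related. Exclude Bay Area local news and "
            "politics.\n\n" + numbered)
```

The selected titles would then go to the scrape/summarize/publish steps.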
The AI is quite gullible. That can be mitigated by giving it more context and adding a review step where you make sure your editorial rules are being enforced.
Another thing I've been wondering about is having a cheaper model (gpt-3.5-turbo) write the articles and then using a more sophisticated one (gpt-4) to review them.
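The draft-then-review idea could be sketched like this; `complete` is a stub standing in for a real chat-completion call, and the model names and prompts are assumptions:

```python
def complete(model, prompt):
    """Stub standing in for a real chat-completion API call."""
    return f"({model}) {prompt[:40]}..."

def write_and_review(source_text):
    """Cheaper model drafts the summary; stronger model enforces the rules."""
    draft = complete("gpt-3.5-turbo",
                     f"Summarize in up to 250 words:\n\n{source_text}")
    verdict = complete("gpt-4",
                       "Review this summary. Reject it if it is off-topic, "
                       f"speculative, or breaks editorial rules:\n\n{draft}")
    return draft, verdict
```

Only summaries the reviewer approves would get published; rejected ones could be dropped or retried.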
I'm curious, can you ballpark how many tokens you use for the round trip of link ingestion and article creation? I haven't really tinkered with LLMs, so I'm trying to wrap my head around the cost of projects like this.
I run the script every 6 hours. When a candidate list is built, it has around 15 items plus the prompt, which usually comes to around 1k tokens. When generating an article, the prompt has 1k tokens and the response has around 1k tokens (I ask for text of up to 250 words). If the source article is long I just truncate it - it should be enough to create a summary presenting the topic. It is possible to batch article creation, submitting the prompt once alongside two articles, but I don't think it would be worth the hassle. I'm getting billed 1c for every 12-15 articles generated (using gpt-3.5-turbo).
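Those figures can be turned into a back-of-envelope cost estimate. The per-1k-token prices below are placeholders, not the author's actual rates or current OpenAI pricing:

```python
def estimate_daily_cost(runs_per_day=4,
                        filter_prompt_tokens=1000,
                        articles_per_run=15,
                        article_prompt_tokens=1000,
                        article_output_tokens=1000,
                        usd_per_1k_input=0.0005,    # placeholder price
                        usd_per_1k_output=0.0015):  # placeholder price
    """Daily spend: one filter call per run plus one call per article."""
    filter_cost = runs_per_day * filter_prompt_tokens / 1000 * usd_per_1k_input
    per_article = (article_prompt_tokens / 1000 * usd_per_1k_input
                   + article_output_tokens / 1000 * usd_per_1k_output)
    return filter_cost + runs_per_day * articles_per_run * per_article
```

With the stated ~2k tokens per article, the bill scales almost entirely with the number of articles generated; the 6-hourly filter call is a rounding error.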
I particularly appreciate that you have a disclaimer at the top stating that the article was AI generated. I wish more sites that used AI did that. Nice execution too.