📦

News Pipeline

Automated AI research digest with LLM-powered summarization

Active v1.0.0 MIT
[vm] localhost :N/A

Tech Stack

Go, Temporal, Claude CLI, Mattermost, systemd

Requirements

  • Go 1.21+
  • Temporal Server
  • Claude CLI
  • Mattermost (optional)

Features

  • ✓ Multi-source paper fetching
  • ✓ Haiku summarization
  • ✓ Sonnet relevance ranking
  • ✓ Mattermost channel posting
  • ✓ Markdown digest generation
  • ✓ Optional TTS audio
  • ✓ Systemd timer scheduling

Staying Current with AI Research

The AI/ML field moves fast. Papers drop daily across multiple platforms:

  • arXiv dumps new preprints constantly
  • HuggingFace Daily Papers curates trending work
  • Papers With Code tracks implementations

Manually checking each source is tedious. News Pipeline automates the entire flow: fetch, summarize, rank, and deliver.


Architecture

A Temporal workflow orchestrates the daily pipeline:

┌──────────────────────────────────────────────────────────────┐
│                    TEMPORAL WORKFLOW                         │
│    Fetch Sources → Summarize → Rank → Post → Archive         │
└──────────────────────────────────────────────────────────────┘
          │                   │                   │
          ▼                   ▼                   ▼
   ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
   │    arXiv    │     │ HuggingFace │     │ PapersCode  │
   │     API     │     │    Daily    │     │     API     │
   └─────────────┘     └─────────────┘     └─────────────┘
                              │
                              ▼
                      ┌───────────────┐
                      │  Claude CLI   │
                      │ Haiku+Sonnet  │
                      └───────────────┘
                              │
          ┌───────────────────┼───────────────────┐
          ▼                   ▼                   ▼
   ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
   │ Mattermost  │     │  Markdown   │     │     TTS     │
   │  Channels   │     │   Digest    │     │    Audio    │
   └─────────────┘     └─────────────┘     └─────────────┘

Workflow Steps

| Step | Model | Purpose |
|---|---|---|
| Fetch | N/A | Pull papers from 3 sources |
| Summarize | Haiku | Generate concise summaries |
| Rank | Sonnet | Score relevance, add labels |
| Post Raw | N/A | Dump to Mattermost raw channel |
| Post Digest | N/A | Curated top items to digest channel |
| Archive | N/A | Save markdown to ~/.news-pipeline/digests/ |
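
The hand-off from Rank to Post Digest amounts to sorting by relevance and keeping the top few items. The sketch below shows one way to do that, assuming the Sonnet step yields an integer score per paper. `RankedPaper`, `TopN`, and the cutoff are hypothetical; the source does not say how many items the digest keeps.

```go
package main

import (
	"fmt"
	"sort"
)

// RankedPaper is a hypothetical result of the Rank step.
type RankedPaper struct {
	Title     string
	Relevance int // score out of 10, matching the digest format
	Labels    []string
}

// TopN returns the n highest-relevance papers without modifying the input.
// Papers with equal scores keep their original relative order.
func TopN(papers []RankedPaper, n int) []RankedPaper {
	sorted := make([]RankedPaper, len(papers))
	copy(sorted, papers)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].Relevance > sorted[j].Relevance
	})
	if n < 0 {
		n = 0
	}
	if n > len(sorted) {
		n = len(sorted)
	}
	return sorted[:n]
}

func main() {
	ranked := []RankedPaper{
		{Title: "Paper A", Relevance: 4},
		{Title: "Paper B", Relevance: 9, Labels: []string{"transformers"}},
		{Title: "Paper C", Relevance: 7},
	}
	for i, p := range TopN(ranked, 2) {
		fmt.Printf("%d. %s (%d/10)\n", i+1, p.Title, p.Relevance)
	}
}
```

Under this split, the raw channel would receive every summarized paper and the digest channel only the output of `TopN`.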

Deployment

Runs as a systemd service with timer-based scheduling:

# Install (builds + installs systemd units)
./install.sh

# Check the trigger timer (fires at 6 AM and 6 PM daily)
sudo systemctl status news-pipeline-trigger.timer

# Worker runs continuously
sudo systemctl status news-pipeline-worker

Manual trigger:

./bin/news-trigger --wait

Output

Digests are saved to ~/.news-pipeline/digests/ as YYYY-MM-DD.md:

# AI/ML News Digest - 2025-12-28

## Top Papers

### 1. [Paper Title](https://arxiv.org/abs/...)
**Relevance**: 9/10 | **Labels**: transformers, efficiency
**Summary**: One-paragraph summary from Haiku...

### 2. [Another Paper](https://huggingface.co/papers/...)
...

Optional TTS generates audio summaries for top items.


Configuration

| Variable | Default | Description |
|---|---|---|
| TEMPORAL_ADDRESS | localhost:7233 | Temporal server |
| MATTERMOST_URL | chat.homelab.com | Mattermost server |
| MATTERMOST_TOKEN | (required) | Auth token |
| NEWS_DIGEST_DIR | ~/.news-pipeline/digests | Output directory |
| CLAUDE_PATH | ~/.nvm/…/claude | Claude CLI path |

Summary

| Benefit | Description |
|---|---|
| Automated | No manual checking of multiple sources |
| Curated | LLM ranking surfaces relevant papers |
| Delivered | Mattermost integration for team sharing |
| Archived | Markdown digests for future reference |