From Links to Reading Insights

This is Part 2 of my journey building a Personal Reading Analytics Dashboard. Catch up on Part 1 - From PIi to Cloud Automation

Article Extractor → Dashboard

In Part 1, I shared how a simple Python script—born from the desire to collect links from my favorite blogs—evolved from manual runs on my laptop to a scheduled GitHub Actions workflow. But once articles were reliably flowing into Google Sheets, a new question emerged:

What if I could understand my reading habits and not just collect them?

That question sparked the next phase: building a personal reading analytics dashboard, powered entirely by open tools and zero infrastructure.

The goal shifted from “Where are the new articles?” to questions that reveal real behavior and patterns:

  • Am I actually reading what I save—or just collecting links?
  • Which sources give me the most value versus the most backlog?
  • How has my reading habit changed over months or years?
  • What’s the age of my oldest unread articles, and which sources contribute most to my unread pile?

To answer these, I moved beyond raw data and built a metrics engine that computes reading progress, backlog aging, and source-level analytics. This engine powers a full dashboard with seven interactive visualizations—tracking everything from monthly trends to unread article age distribution—without relying on servers, databases, or paid infrastructure.

The Pipeline: Simple, Batched, and Static

Here is how it works today:

Go for Metrics, HTML for Portability

The analytics engine is a lean Go program that transforms raw article data from Google Sheets into meaningful reading insights. It computes:

  • Reading progress: total articles, read vs. unread counts, and overall read rate with visual emphasis
  • Temporal patterns: articles per month, seasonal trends, and publication year distribution
  • Backlog analysis: unread article age buckets (<1 month, 1–3 months, 3–6 months, 6–12 months, >1 year)
  • Source-level metrics: read percentage per source, most and least read providers, and Substack author averages
  • Spotlight highlights: top 3 oldest unread articles with direct links and age calculations

All results are written to a timestamped JSON file like metrics/2025-12-29.json. At runtime, the dashboard simply loads the latest file—no database, no API, no dynamic server.

A second Go process uses html/template to generate a static index.html with seven interactive Chart.js visualizations. The entire site deploys to GitHub Pages via GitHub Actions, delivering a live, zero-cost reading analytics dashboard that updates daily.

Summary

  • Phase 1: Automate article collection with GitHub Actions
  • Phase 2: Add metrics and visualization using Go and GitHub Pages
  • Result: A personal, zero infrastructure reading analytics dashboard

Sometimes, the best observability starts at home—with your own habits.

Thank You

Thank you for reading! You’re awesome, and I hope this post helped. Until next time!