M365 Activity Collector

A Rust-based service that collects Microsoft 365 user activity metrics and exposes them to Prometheus.

Architecture

This service operates as a continuous daemon with two main components:

  1. Background Sync Loop: Periodically fetches M365 activity data (email and Teams usage) from Microsoft Graph API and stores it in memory and optionally in PostgreSQL
  2. Metrics Server: Exposes aggregated metrics via HTTP endpoint for Prometheus scraping

Storage Options

In-Memory Storage (Default)

All activity data is stored in memory using Arc<RwLock<HashMap>>. The data structure uses (NaiveDate, String) as the key (date + user principal name) to track daily activity per user.

  • No database required: Prometheus itself acts as the time-series database
  • Automatic cleanup: Data older than 30 days is automatically removed from memory
  • Thread-safe: Uses RwLock to allow concurrent reads from metrics endpoint while sync loop updates data
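
The storage layout described above can be sketched roughly as follows. This is a minimal, dependency-free illustration rather than the service's actual code. `Day` is a stand-in for `NaiveDate`. `DailyActivity`, `upsert_and_prune`, and `email_send_total` are hypothetical names introduced here.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// Stand-in for chrono's `NaiveDate`: days since an arbitrary epoch.
/// (Assumption made to keep this sketch dependency-free.)
pub type Day = i64;

/// Hypothetical per-user, per-day record; the real field set may differ.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct DailyActivity {
    pub email_send: u64,
    pub teams_meetings: u64,
}

/// Shared store keyed by (date, user principal name).
pub type Store = Arc<RwLock<HashMap<(Day, String), DailyActivity>>>;

pub const RETENTION_DAYS: i64 = 30;

/// Write path: insert or overwrite one day's record, then drop entries
/// that are more than `RETENTION_DAYS` older than `today`.
pub fn upsert_and_prune(store: &Store, today: Day, day: Day, upn: &str, activity: DailyActivity) {
    let mut map = store.write().expect("store lock poisoned");
    map.insert((day, upn.to_string()), activity);
    map.retain(|(d, _), _| today - *d <= RETENTION_DAYS);
}

/// Read path: sum one user's sent emails across every retained day.
/// Multiple readers can hold the read lock at the same time.
pub fn email_send_total(store: &Store, upn: &str) -> u64 {
    let map = store.read().expect("store lock poisoned");
    map.iter()
        .filter(|((_, user), _)| user.as_str() == upn)
        .map(|(_, activity)| activity.email_send)
        .sum()
}
```

The write lock is held only for the duration of one update, so a metrics scrape that arrives mid-sync waits briefly instead of observing a half-updated map.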

PostgreSQL Storage (Optional)

For historical time-series analysis, the application can also store data in PostgreSQL with proper timestamps:

  • Historical Data: Maintains 30-day rolling window with timestamps for Grafana time-series queries
  • Grafana Integration: Query directly from Grafana using PostgreSQL data source
  • SQL Queries: Use standard SQL for complex analysis, joins, and per-user historical trends
  • Persistence: Data survives application restarts
  • Backward Compatible: Works alongside Prometheus metrics (both can be used simultaneously)

See POSTGRES_QUICKSTART.md for a 5-minute setup guide.

Features

  • Email activity tracking (send, receive, read counts)
  • Microsoft Teams activity tracking (team chats, private chats, calls, meetings)
  • Teams communication metrics (audio duration, video duration, screenshare duration)
  • Azure DevOps git commit tracking
  • Per-user metrics with 30-day rolling aggregation
  • Automatic periodic synchronization of daily activity data (hourly by default, see SYNC_INT)
  • Prometheus-compatible metrics endpoint
  • NEW: Optional PostgreSQL storage with timestamps for Grafana time-series queries

Prerequisites

  • Rust 1.70+ (for development)
  • Azure AD application with Microsoft Graph API permissions:
    • Reports.Read.All (Application permission)
  • Valid Azure AD tenant with Microsoft 365 licenses
  • Azure DevOps Personal Access Token (PAT) with Code (Read) permissions (for git tracking, optional)

Configuration

Copy .env.example to .env and configure:

# Azure AD / Entra ID credentials
TENANT_ID=your-tenant-id-here
CLIENT_ID=your-client-id-here
CLIENT_SECRET=your-client-secret-here

# Azure DevOps configuration for git commit tracking
AZDO_PAT=your-azdo-pat-here
AZDO_ORGANIZATION=your-organization-name
# AZDO_PROJECT=your-project-name  # Optional: leave unset to fetch all projects in org

# Metrics server bind address
BIND_ADDR=0.0.0.0:3000

# Sync interval in seconds (how often to fetch M365 data)
SYNC_INT=3600

# Collection date offset (days back from today)
# Default is 2 days to avoid Graph API reporting delays
COLLECTION_DAYS_OFFSET=2

# Optional: PostgreSQL database for historical time-series data
# DATABASE_URL=postgresql://m365user:password@localhost:5432/m365_activity

Quick Setup with PostgreSQL (Optional)

For Grafana time-series queries with proper timestamps:

# 1. Start PostgreSQL with Docker
./setup-postgres.sh

# 2. Add DATABASE_URL to .env (script will prompt you)

# 3. Run application
cargo run

See POSTGRES_QUICKSTART.md for a detailed setup guide.

Environment Variables

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| TENANT_ID | Yes | - | Azure AD tenant ID |
| CLIENT_ID | Yes | - | Azure AD application client ID |
| CLIENT_SECRET | Yes | - | Azure AD application client secret |
| AZDO_PAT | No* | - | Azure DevOps personal access token (Code Read permission) |
| AZDO_ORGANIZATION | No* | - | Azure DevOps organization name |
| AZDO_PROJECT | No | - | Azure DevOps project name (if unset, fetches all projects) |
| BIND_ADDR | Yes | - | Address to bind metrics server (e.g., 0.0.0.0:3000) |
| SYNC_INT | No | 3600 | Sync interval in seconds |
| COLLECTION_DAYS_OFFSET | No | 2 | Days back from the current date to collect (avoids Graph API reporting delays) |
| DATABASE_URL | No | - | PostgreSQL connection string; enables optional historical storage |

* Required only if you want git commit tracking
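
A minimal sketch of how these variables could be loaded, assuming required values fail fast and optional ones fall back to the documented defaults. The `Config` struct, its field names, and `load_config` are hypothetical; the service's real loader may differ.

```rust
/// Settings from the environment variable table. Field names are illustrative.
#[derive(Debug, PartialEq)]
pub struct Config {
    pub tenant_id: String,
    pub client_id: String,
    pub client_secret: String,
    pub bind_addr: String,
    pub sync_int_secs: u64,
    pub collection_days_offset: i64,
    pub database_url: Option<String>,
}

/// Build a `Config` from any key/value lookup,
/// for example `|k| std::env::var(k).ok()`.
pub fn load_config<F>(get: F) -> Result<Config, String>
where
    F: Fn(&str) -> Option<String>,
{
    // Required variables fail with the variable name in the message.
    let required =
        |key: &str| get(key).ok_or_else(|| format!("missing required variable {key}"));

    // Optional numeric variables fall back to the documented defaults.
    let sync_int_secs = match get("SYNC_INT") {
        Some(raw) => raw
            .parse::<u64>()
            .map_err(|e| format!("invalid SYNC_INT {raw:?}: {e}"))?,
        None => 3600,
    };
    let collection_days_offset = match get("COLLECTION_DAYS_OFFSET") {
        Some(raw) => raw
            .parse::<i64>()
            .map_err(|e| format!("invalid COLLECTION_DAYS_OFFSET {raw:?}: {e}"))?,
        None => 2,
    };

    Ok(Config {
        tenant_id: required("TENANT_ID")?,
        client_id: required("CLIENT_ID")?,
        client_secret: required("CLIENT_SECRET")?,
        bind_addr: required("BIND_ADDR")?,
        sync_int_secs,
        collection_days_offset,
        // Absent DATABASE_URL means in-memory storage only.
        database_url: get("DATABASE_URL"),
    })
}
```

Taking the lookup as a closure keeps the function testable without touching the process environment. In the running service it would be called as `load_config(|k| std::env::var(k).ok())`.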

Building

cargo build --release

Running

cargo run --release

Or run the compiled binary:

./target/release/m365-activity-collector

Metrics Endpoint

The service exposes metrics at http://<BIND_ADDR>/metrics

Available Metrics

All metrics represent 30-day rolling totals per user:

| Metric Name | Type | Labels | Description |
| --- | --- | --- | --- |
| m365_email_send_total | Gauge | user | Total emails sent |
| m365_email_receive_total | Gauge | user | Total emails received |
| m365_email_read_total | Gauge | user | Total emails read |
| m365_teams_team_chat_total | Gauge | user | Total team chat messages |
| m365_teams_private_chat_total | Gauge | user | Total private chat messages |
| m365_teams_call_total | Gauge | user | Total Teams calls |
| m365_teams_meeting_total | Gauge | user | Total Teams meetings attended |
| m365_teams_audio_duration_minutes | Gauge | user | Total audio duration in minutes |
| m365_teams_video_duration_minutes | Gauge | user | Total video duration in minutes |
| m365_teams_screenshare_duration_minutes | Gauge | user | Total screenshare duration in minutes |
| m365_git_commits_total | Gauge | user | Total git commits across all Azure DevOps projects in organization |

Example Metrics Output

# HELP m365_email_send_total Email send count over last 30 days
# TYPE m365_email_send_total gauge
m365_email_send_total{user="john.doe@example.com"} 142
m365_email_send_total{user="jane.smith@example.com"} 89

# HELP m365_teams_team_chat_total Teams team chats over last 30 days
# TYPE m365_teams_team_chat_total gauge
m365_teams_team_chat_total{user="john.doe@example.com"} 523
m365_teams_team_chat_total{user="jane.smith@example.com"} 412

# HELP m365_teams_audio_duration_minutes Teams audio duration in minutes over last 30 days
# TYPE m365_teams_audio_duration_minutes gauge
m365_teams_audio_duration_minutes{user="john.doe@example.com"} 1250
m365_teams_audio_duration_minutes{user="jane.smith@example.com"} 980

# HELP m365_teams_video_duration_minutes Teams video duration in minutes over last 30 days
# TYPE m365_teams_video_duration_minutes gauge
m365_teams_video_duration_minutes{user="john.doe@example.com"} 420
m365_teams_video_duration_minutes{user="jane.smith@example.com"} 650

# HELP m365_teams_screenshare_duration_minutes Teams screenshare duration in minutes over last 30 days
# TYPE m365_teams_screenshare_duration_minutes gauge
m365_teams_screenshare_duration_minutes{user="john.doe@example.com"} 180
m365_teams_screenshare_duration_minutes{user="jane.smith@example.com"} 210

# HELP m365_git_commits_total Git commits over last 30 days
# TYPE m365_git_commits_total gauge
m365_git_commits_total{user="john.doe@example.com"} 87
m365_git_commits_total{user="jane.smith@example.com"} 142

Prometheus Configuration

Add this job to your prometheus.yml:

scrape_configs:
  - job_name: 'm365-activity'
    scrape_interval: 60s
    static_configs:
      - targets: ['localhost:3000']

Example Prometheus Queries

Top 10 users by email activity:

topk(10, m365_email_send_total + m365_email_receive_total)

Users with most Teams meetings:

topk(10, m365_teams_meeting_total)

Total audio/video time per user (in hours):

(m365_teams_audio_duration_minutes + m365_teams_video_duration_minutes) / 60

Users spending most time on video calls:

topk(10, m365_teams_video_duration_minutes)

Average screenshare duration per meeting (users with no meetings are filtered out to avoid division by zero):

m365_teams_screenshare_duration_minutes / (m365_teams_meeting_total > 0)

Users with high collaboration (chats + meetings):

topk(10, m365_teams_team_chat_total + m365_teams_private_chat_total + m365_teams_meeting_total)

Total organization-wide communication time (hours):

sum(m365_teams_audio_duration_minutes + m365_teams_video_duration_minutes) / 60

Top 10 developers by git commits:

topk(10, m365_git_commits_total)

Total organization git activity:

sum(m365_git_commits_total)

How It Works

  1. Startup: Service initializes and creates in-memory storage
  2. Sync Loop: Every SYNC_INT seconds:
    • Authenticates with Azure AD
    • Fetches email activity CSV from Graph API
    • Fetches Teams activity CSV from Graph API
    • Fetches git commits from Azure DevOps REST API
    • Parses all data
    • Updates in-memory HashMap with latest data
    • Removes entries older than 30 days
  3. Metrics Scraping: When Prometheus scrapes /metrics:
    • Reads from in-memory storage
    • Aggregates last 30 days per user
    • Exposes metrics in Prometheus format
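
The metrics scraping step can be illustrated with a small sketch that sums per-day counts into per-user totals and renders them in the Prometheus text exposition format. The function names and the `(day, user, count)` tuple layout are assumptions made for this example. Days are plain integers standing in for `NaiveDate`.

```rust
use std::collections::BTreeMap;
use std::fmt::Write;

/// Sum per-day counts into per-user totals for days inside the window.
/// Each entry is (day number, user principal name, count).
pub fn aggregate(entries: &[(i64, &str, u64)], today: i64, window_days: i64) -> BTreeMap<String, u64> {
    let mut totals = BTreeMap::new();
    for &(day, user, count) in entries {
        if day <= today && today - day <= window_days {
            *totals.entry(user.to_string()).or_insert(0) += count;
        }
    }
    totals
}

/// Render one gauge family in Prometheus text exposition format.
pub fn render_gauge(name: &str, help: &str, totals: &BTreeMap<String, u64>) -> String {
    let mut out = String::new();
    writeln!(out, "# HELP {name} {help}").unwrap();
    writeln!(out, "# TYPE {name} gauge").unwrap();
    for (user, value) in totals {
        // Label values must escape backslash, double quote, and newline.
        let escaped = user
            .replace('\\', "\\\\")
            .replace('"', "\\\"")
            .replace('\n', "\\n");
        writeln!(out, "{name}{{user=\"{escaped}\"}} {value}").unwrap();
    }
    out
}
```

A `BTreeMap` is used here so that users are emitted in a stable order, which keeps consecutive scrapes easy to diff. The real service may order its output differently.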

Data Flow

Microsoft Graph API       Azure DevOps API
        ↓                        ↓
        └──────── Sync Loop ─────┘
              (every hour)
                   ↓
          ┌────────┴────────┐
          ↓                 ↓
   In-Memory HashMap    PostgreSQL (optional)
  (30-day rolling)     (with timestamps)
          ↓                 ↓
   Metrics Endpoint    Grafana SQL Queries
          ↓
     Prometheus

Storage Options:

  • In-Memory + Prometheus: Default, no database needed
  • PostgreSQL + Grafana: Optional, for historical time-series analysis
  • Both: Recommended for comprehensive monitoring and analysis

Memory Usage

Memory usage is proportional to:

  • Number of active users
  • 30 days of daily data per user

Typical memory footprint: < 10 MB for organizations with hundreds of users.
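
As a rough sanity check on that figure, the store size can be estimated from the number of users, the retention window, and an assumed per-entry size. The sketch below is a back-of-the-envelope estimate, not a measurement. The 16-byte per-entry hash table overhead and the parameter names are assumptions.

```rust
use std::mem::size_of;

/// Rough estimate of the in-memory store size in bytes.
/// `avg_upn_len` is the average user principal name length in bytes and
/// `counters` is the number of 64-bit counters kept per user per day.
pub fn estimate_store_bytes(users: usize, retained_days: usize, avg_upn_len: usize, counters: usize) -> usize {
    // Key: a date (modelled as i64) plus an owned String and its heap bytes.
    let key = size_of::<i64>() + size_of::<String>() + avg_upn_len;
    // Value: one u64 per tracked counter.
    let value = counters * size_of::<u64>();
    // Assumed bookkeeping cost per hash map entry.
    let overhead = 16;
    users * retained_days * (key + value + overhead)
}
```

For example, `estimate_store_bytes(500, 30, 32, 11)` models 500 users, 30 days, 32-byte user names, and the 11 metrics listed above. On a 64-bit target this comes to about 2.5 MB, comfortably under the 10 MB figure.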

Error Handling

  • Sync errors are logged but don't crash the service
  • Metrics endpoint remains available even if sync fails
  • Authentication tokens are refreshed automatically
  • CSV parsing errors are logged with context
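
The behaviour above might look roughly like the following sketch: a parser that reports the offending line, and a sync step that logs a failure and keeps the previous data instead of panicking. The column names, `EmailRow`, `parse_email_csv`, and `apply_sync` are assumptions for illustration. The parser assumes no quoted fields, so a real implementation would want a proper CSV library.

```rust
/// One parsed row of the email activity report (hypothetical subset of columns).
#[derive(Debug, PartialEq)]
pub struct EmailRow {
    pub upn: String,
    pub send: u64,
}

/// Parse a simplified CSV body. Errors name the line at fault.
pub fn parse_email_csv(body: &str) -> Result<Vec<EmailRow>, String> {
    // Tolerate a leading UTF-8 byte order mark if one is present.
    let body = body.trim_start_matches('\u{feff}');
    let mut lines = body.lines();
    let header: Vec<&str> = lines.next().ok_or("empty CSV body")?.split(',').collect();

    // Look columns up by name so a reordered report still parses.
    let col = |name: &str| {
        header
            .iter()
            .position(|h| h.trim() == name)
            .ok_or_else(|| format!("missing column {name:?}"))
    };
    let upn_idx = col("User Principal Name")?;
    let send_idx = col("Send Count")?;

    let mut rows = Vec::new();
    for (i, line) in lines.enumerate() {
        if line.trim().is_empty() {
            continue;
        }
        let line_no = i + 2; // 1-based, counting the header line
        let fields: Vec<&str> = line.split(',').collect();
        let upn = fields
            .get(upn_idx)
            .ok_or_else(|| format!("line {line_no}: missing user column"))?;
        let raw = fields
            .get(send_idx)
            .ok_or_else(|| format!("line {line_no}: missing Send Count column"))?;
        let send = raw
            .trim()
            .parse::<u64>()
            .map_err(|e| format!("line {line_no}: bad Send Count {raw:?}: {e}"))?;
        rows.push(EmailRow { upn: upn.trim().to_string(), send });
    }
    Ok(rows)
}

/// Apply one sync result: replace the cached rows on success,
/// log and keep the previous rows on failure. Returns whether the sync succeeded.
pub fn apply_sync(cache: &mut Vec<EmailRow>, result: Result<Vec<EmailRow>, String>) -> bool {
    match result {
        Ok(rows) => {
            println!("sync ok: {} users", rows.len());
            *cache = rows;
            true
        }
        Err(e) => {
            println!("sync failed, keeping previous data: {e}");
            false
        }
    }
}
```

Because a failed sync leaves the cache untouched, the metrics endpoint keeps serving the last successful snapshot until the next cycle succeeds.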

Logging

The service prints to stdout:

  • Sync start/completion with user counts
  • Total entries in memory after each sync
  • Any sync errors encountered

Future Enhancements

  • SharePoint activity metrics
  • OneDrive usage metrics
  • GitHub integration for git commits
  • Pull request and code review metrics
  • Configurable retention period
  • Health check endpoint
  • Structured logging with log levels

Documentation

Quick Start Guides

Comprehensive Guides

Technical Documentation

Feature Documentation

Grafana Dashboards

  • grafana/ - Pre-built Grafana dashboard JSON files

License

[Your License Here]