About This Dashboard
Purpose
This site consolidates service and infrastructure statistics for the UCLA Library Data Science Center (DSC). It is rendered from source data and published as a static site on GitHub Pages.
Primary audiences:
- DSC leadership and library administration: annual reporting, program review, and resource planning
- DSC staff: understanding service load, trends, and data limitations
- Researchers and partners: understanding how DSC tracks and reports impact
Update Cadence
Each page shows a “Last rendered” date at the bottom, which records when the data was last built into the page. The site is not a live feed; it is re-rendered manually when source data is refreshed.
| Data stream | Typical refresh |
|---|---|
| Consulting (LibInsight) | At semester end or on request |
| DataSquad walk-in sign-ins | With each form export |
| Instruction (LibInsight workshops) | At end of academic year |
| AWS costs | Monthly (Cost Explorer API pull) |
| UCLA Dataverse metrics | Monthly (Dataverse API pull) |
| S3 storage (CloudWatch) | Fetched live at each render |
To request a data refresh, open an issue in the GitHub repository or contact the site maintainer.
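
The monthly Cost Explorer pull listed in the table above could be sketched roughly as follows. This is a minimal illustration, not the dashboard's actual script: the function names (`month_window`, `summarize_costs`, `fetch_month`), the choice of the `UnblendedCost` metric, and the grouping by `SERVICE` are all assumptions. It relies on the boto3 `get_cost_and_usage` call and on the `Estimated` flag that Cost Explorer returns for periods whose billing has not yet closed. Pagination (`NextPageToken`) is ignored for brevity.

```python
from datetime import date


def month_window(year: int, month: int) -> dict:
    """Build a Cost Explorer TimePeriod for one calendar month (End is exclusive)."""
    start = date(year, month, 1)
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    return {"Start": start.isoformat(), "End": end.isoformat()}


def summarize_costs(response: dict, metric: str = "UnblendedCost") -> list:
    """Flatten a get_cost_and_usage response into one row per month and service."""
    rows = []
    for period in response.get("ResultsByTime", []):
        for group in period.get("Groups", []):
            amount = group["Metrics"][metric]
            rows.append({
                "month": period["TimePeriod"]["Start"][:7],
                "service": group["Keys"][0],
                "amount": float(amount["Amount"]),
                "unit": amount["Unit"],
                # True while the billing period is still open.
                "estimated": bool(period.get("Estimated", False)),
            })
    return rows


def fetch_month(year: int, month: int) -> list:
    """Pull one month of spend grouped by service (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helpers above work without boto3

    client = boto3.client("ce")
    response = client.get_cost_and_usage(
        TimePeriod=month_window(year, month),
        Granularity="MONTHLY",
        Metrics=["UnblendedCost"],
        GroupBy=[{"Type": "DIMENSION", "Key": "SERVICE"}],
    )
    return summarize_costs(response)
```

Carrying the `estimated` flag through to the processed data would let a page mark recent months as provisional.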
Data Sources
| Source | What it tracks | Location |
|---|---|---|
| LibInsight (Springshare) | DSC scheduled consultations; workshop attendance | data/processed/consultations/ |
| Manual LibInsight logs | Walk-in and complex appointments entered manually | Merged into LibInsight export |
| DataSquad sign-in forms | Walk-in consultations at the DataSquad desk | data/processed/consultations/ |
| Trello | DataSquad project task cards and activity (2020–2024) | data/processed/consultations/ |
| GitHub Projects | DataSquad task tracking (2024+; planned integration) | Not yet integrated |
| Jira Service Desk | Service requests (planned integration) | Not yet integrated |
| AWS Cost Explorer API | Monthly cloud spend by service/application | data/raw/infrastructure/AWS_Costs/ |
| UCLA Dataverse API | Cumulative datasets, files, and download counts | data/raw/infrastructure/ |
| AWS CloudWatch | S3 bucket storage sizes | Fetched live at render |
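
As an illustration of the Dataverse row above, the cumulative dataset, file, and download counts could be read from the Dataverse Metrics API along these lines. This is a hedged sketch rather than the project's code: `BASE_URL` is a placeholder that must be replaced with the real installation address, and the helper names are hypothetical. It assumes the documented `/api/info/metrics/{type}` endpoints, the optional `/toMonth/YYYY-MM` suffix, and a JSON response of the form `{"status": "OK", "data": {"count": N}}`.

```python
import json
from typing import Optional
from urllib.request import urlopen

# Placeholder (assumption): replace with the real Dataverse installation URL.
BASE_URL = "https://dataverse.example.edu"

METRIC_TYPES = ("datasets", "files", "downloads")


def metrics_url(metric: str, to_month: Optional[str] = None,
                base_url: str = BASE_URL) -> str:
    """Build the Metrics API URL for a cumulative count, optionally up to a month."""
    if metric not in METRIC_TYPES:
        raise ValueError(f"unsupported metric: {metric!r}")
    url = f"{base_url.rstrip('/')}/api/info/metrics/{metric}"
    if to_month:
        url += f"/toMonth/{to_month}"  # expected format: YYYY-MM
    return url


def parse_count(payload: dict) -> int:
    """Extract the count from a Metrics API response body."""
    if payload.get("status") != "OK":
        raise ValueError(f"unexpected response status: {payload.get('status')!r}")
    return int(payload["data"]["count"])


def fetch_count(metric: str, to_month: Optional[str] = None) -> int:
    """Fetch one cumulative count over HTTP (requires network access)."""
    with urlopen(metrics_url(metric, to_month), timeout=30) as resp:
        return parse_count(json.load(resp))
```

For example, `fetch_count("downloads", "2024-06")` would request the cumulative download count through June 2024.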
Definitions
Consultation: A direct interaction between a DSC staff member or DataSquad student and a researcher. Includes scheduled appointments tracked in LibInsight (Calendly-integrated) and walk-in sign-ins at the DataSquad desk. Excludes canceled appointments (status = Canceled in LibInsight).
Task card: A Trello card or GitHub Projects item representing a unit of assigned project work for a DataSquad member. Task cards are not consultations; they represent follow-on or project-based work, often without a direct patron interaction at the time of the task.
Task activity (comment-weighted): A count of comments on task cards, used as a rough workload proxy. This is an activity signal, not equivalent to hours worked, tasks completed, or consultations.
Workshop attendee: One participant registration record in a LibInsight event export. A single person attending three workshops is counted three times, so the figure reflects total attendee-events, not unique individuals.
Dataset (Dataverse): A published data package in UCLA Dataverse, that is, the native Dataverse object that groups related files under a single persistent identifier (DOI).
File (Dataverse): An individual file within a dataset. One dataset may contain one or many files.
Download (Dataverse): A file-level download event recorded by the Dataverse metrics API. Counts include both researcher-initiated and automated/programmatic downloads (e.g., crawlers, API clients) and are not filtered for human-only access.
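
Two of the counting rules defined above, excluding canceled consultations and counting attendee-events rather than people, can be made concrete with a small sketch. The record layout is an assumption made for illustration: the field names `status` and `email` are hypothetical and may not match the actual LibInsight export columns. `count_unique_attendees` is not a dashboard metric; it is included only to show the contrast.

```python
def count_consultations(records):
    """Count consultation records, skipping canceled appointments."""
    return sum(1 for rec in records if rec.get("status") != "Canceled")


def count_attendee_events(registrations):
    """Total attendee-events: every registration row counts once."""
    return len(registrations)


def count_unique_attendees(registrations, key="email"):
    """Distinct people, for contrast only (assumes a stable identifier column)."""
    return len({reg[key] for reg in registrations})


registrations = [
    {"email": "a@example.edu", "workshop": "Intro to R"},
    {"email": "a@example.edu", "workshop": "Intro to Python"},
    {"email": "a@example.edu", "workshop": "GIS basics"},
    {"email": "b@example.edu", "workshop": "Intro to R"},
]

print(count_attendee_events(registrations))   # 4 attendee-events
print(count_unique_attendees(registrations))  # 2 distinct people
```

The first figure is what the dashboard's definition of a workshop attendee describes; the second would be lower whenever anyone attends more than one workshop.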
Limitations and Caveats
- Walk-in undercounting: Walk-in consultations may be undercounted in periods where DataSquad sign-in forms were not consistently used.
- Manual log deduplication: Manual LibInsight entries and scheduled exports are deduplicated with a same-provider/same-minute heuristic. A small number of genuine duplicates may remain.
- Trello coverage gap: Trello task metrics reflect the period when Trello was the primary DataSquad task system. The transition to GitHub Projects creates a data discontinuity; cross-system comparisons require care.
- AWS billing lag: Cost Explorer data for the most recent 1–2 months is an estimate; final billed amounts may differ after the billing period closes.
- Dataverse bot downloads: Download counts are not filtered for automated access. Cumulative totals likely include a fraction of non-human downloads.
- Instruction completeness: Workshop attendance completeness depends on consistent LibInsight data entry. Some events may be missing from exports.
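
The same-provider/same-minute heuristic mentioned in the deduplication caveat might look something like the sketch below. It is an assumption-laden illustration, not the pipeline's code: the field names `provider` and `start`, the case-insensitive provider match, and the choice to keep the first record seen are all hypothetical.

```python
from datetime import datetime


def dedupe_same_minute(records):
    """Keep the first record for each (provider, start-minute) pair."""
    seen = set()
    kept = []
    for rec in records:
        # Truncate the timestamp to the minute before comparing.
        minute = rec["start"].replace(second=0, microsecond=0)
        key = (rec["provider"].strip().lower(), minute)
        if key in seen:
            continue  # same provider, same minute: treated as a duplicate
        seen.add(key)
        kept.append(rec)
    return kept


records = [
    {"provider": "Alex", "start": datetime(2024, 3, 1, 10, 0, 5), "source": "scheduled"},
    {"provider": "alex ", "start": datetime(2024, 3, 1, 10, 0, 40), "source": "manual"},
    {"provider": "Alex", "start": datetime(2024, 3, 1, 10, 1, 0), "source": "manual"},
    {"provider": "Sam", "start": datetime(2024, 3, 1, 10, 0, 0), "source": "manual"},
]

deduped = dedupe_same_minute(records)
print(len(deduped))  # 3
```

The third record shows why some genuine duplicates can survive: a manual entry logged one minute later than the scheduled one falls under a different key and is kept.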
Counting Rules Reference
The Metrics Governance document defines:
- Metric naming conventions (encoding who, domain, behavior, and unit)
- Approved and prohibited metric combinations
- Source transition policy (Trello → GitHub Projects)
- Data stewardship rules (immutable raw exports, reproducible processed data)
How to Cite or Reuse These Metrics
When referencing figures from this dashboard in reports or presentations:
UCLA Library Data Science Center. (Year). DSC Statistics Dashboard. UCLA Library. Retrieved [date] from [site URL].
Include the data coverage period (e.g., “2021–present”) and note the source system (e.g., “LibInsight scheduled exports”) when quoting specific numbers. For methodology questions, consult the Metrics Governance document or contact the DSC.
Repository and Contact
- Source code: github.com/UCLALibrary/dsc-stats-reports (update to actual URL)
- Issues / contributions: Open a pull request or issue in the repository
- General inquiries: dsc@library.ucla.edu (confirm address)
Built with Quarto. Data processed with R and Python.