Metrics Usage Guide
Basic Usage
All implemented metrics are open for anyone to use. The root URL of OpenDigger static data is
https://oss.open-digger.cn/{platform}/{org/login}/{repo}
Use github or gitee as the platform, then substitute the org/repo name (for a repository) or the user login (for a user) to get your data.
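The URL pattern above can be assembled programmatically. The following is a minimal sketch, not part of OpenDigger itself: the helper name build_data_url is hypothetical, and X-lab2017/open-digger is used only as an illustrative repository name. The file name is passed in by the caller; meta.json is used here because it is the one file name this guide spells out.

```python
from urllib.parse import quote

ROOT = "https://oss.open-digger.cn"
PLATFORMS = {"github", "gitee"}  # the two platforms named in this guide


def build_data_url(platform: str, name: str, file_name: str) -> str:
    """Build a static data URL (hypothetical helper).

    name is "org/repo" for a repository or a bare login for a user.
    """
    if platform not in PLATFORMS:
        raise ValueError(f"unsupported platform: {platform!r}")
    parts = name.strip("/").split("/")
    if len(parts) not in (1, 2) or not all(parts):
        raise ValueError(f"expected 'org/repo' or 'login', got {name!r}")
    # Percent-encode each path segment separately so "/" stays a separator.
    path = "/".join(quote(part, safe="") for part in parts)
    return f"{ROOT}/{platform}/{path}/{file_name}"


print(build_data_url("github", "X-lab2017/open-digger", "meta.json"))
```

Passing a bare login instead of org/repo yields the shorter user-level path.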
Below is a complete list of metric data, and you can try the data on the Playground page, or refer to the corresponding documentation page for specific data.
OpenRank
Statistics
Developers
Issues
Change Requests
Data Structure
Each exported metric file is a JSON object whose keys identify a month, quarter, or year, and whose values are the metric data for that period.
- Monthly keys are in the format YYYY-MM.
- Quarterly keys are in the format YYYYQX, where X ranges from 1 to 4, with Q1 covering January to March of that year, Q2 April to June, and so on.
- Yearly keys are in the format YYYY.
All keys are at the top-level structure, arranged in the order of year, month, and quarter, with each type sorted chronologically.
For example, for the OpenDigger repository, its global OpenRank data is:
{
"2020":34.81,"2021":55.59,"2022":92.97,... // Yearly data
"2020-08":4.91,"2020-09":5.17,"2020-10":5.1,... // Monthly data
"2020Q3":10.08,"2020Q4":24.73,"2021Q1":22.18,... // Quarterly data
}
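Since all three kinds of key share one top-level object, a consumer usually has to separate them before plotting or aggregating. The sketch below shows one way to do that with the key formats described above. The function name split_by_period is hypothetical, and the sample values are the ones from the example, with the elided entries left out.

```python
import re

# Key formats as described in this guide.
YEAR_KEY = re.compile(r"\d{4}")
QUARTER_KEY = re.compile(r"\d{4}Q[1-4]")
MONTH_KEY = re.compile(r"\d{4}-(0[1-9]|1[0-2])")


def split_by_period(metric: dict) -> dict:
    """Group a metric object's entries by key type (hypothetical helper)."""
    groups = {"yearly": {}, "quarterly": {}, "monthly": {}, "other": {}}
    for key, value in metric.items():
        if YEAR_KEY.fullmatch(key):
            groups["yearly"][key] = value
        elif QUARTER_KEY.fullmatch(key):
            groups["quarterly"][key] = value
        elif MONTH_KEY.fullmatch(key):
            groups["monthly"][key] = value
        else:
            # Anything that is not a plain period key is kept aside.
            groups["other"][key] = value
    return groups


sample = {
    "2020": 34.81, "2021": 55.59,
    "2020-08": 4.91, "2020-09": 5.17,
    "2020Q3": 10.08, "2020Q4": 24.73,
}
groups = split_by_period(sample)
print(sorted(groups["monthly"]))
```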
Because OpenDigger's GitHub data source has missing data, a key such as 2021-10-raw may exist; it holds the raw value of the metric for that month. To keep the metric data continuous over time, the corresponding 2021-10 value is calculated as an interpolation based on the values from two months before and after. For the specific code, please refer to here.
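A consumer may want to know which periods carry interpolated values, for example to mark them in a chart. The following sketch separates the -raw keys from the regular ones and lists the periods that have both. It does not reproduce OpenDigger's interpolation; the helper names are hypothetical and the sample numbers are made up for illustration.

```python
RAW_SUFFIX = "-raw"


def split_raw(metric: dict) -> tuple:
    """Return (regular values, raw values keyed by period). Hypothetical helper."""
    values, raw = {}, {}
    for key, value in metric.items():
        if key.endswith(RAW_SUFFIX):
            # "2021-10-raw" is stored under its period, "2021-10".
            raw[key[: -len(RAW_SUFFIX)]] = value
        else:
            values[key] = value
    return values, raw


def interpolated_periods(metric: dict) -> list:
    """Periods whose regular value stands in for a raw value."""
    values, raw = split_raw(metric)
    return sorted(period for period in raw if period in values)


# Made-up numbers, for illustration only.
sample = {"2021-09": 5.0, "2021-10": 5.5, "2021-10-raw": 2.0, "2021-11": 6.0}
print(interpolated_periods(sample))
```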
Export Range
OpenDigger does not export metrics data for all repositories and users. The exported repositories and users are listed in repo_list.csv and user_list.csv, where:
- Each row of repo_list.csv has the structure id,platform,repo_name: the database ID, platform name, and full repository name, separated by commas.
- Each row of user_list.csv has the structure id,platform,login: the database ID, platform name, and user login name, separated by commas.
The database ID is a unique ID on the corresponding platform, consistent with the data of each platform; the platform name and database ID uniquely identify a repository or user.
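Both list files share the same three-column shape, so one parser can serve for either. The sketch below indexes rows by (platform, id), the pair described above as uniquely identifying a repository or user. It is an illustration under assumptions: the names ExportEntry and parse_export_list are hypothetical, whether the real files carry a header row is not stated here (so non-numeric first columns are simply skipped), and the second sample row is invented.

```python
import csv
import io
from typing import NamedTuple


class ExportEntry(NamedTuple):
    id: int        # database ID on the platform
    platform: str  # platform name
    name: str      # full repository name, or user login


def parse_export_list(text: str) -> dict:
    """Index the rows of repo_list.csv or user_list.csv by (platform, id)."""
    entries = {}
    for row in csv.reader(io.StringIO(text)):
        # Skip blank lines, malformed rows, and a header row if one exists.
        if len(row) != 3 or not row[0].strip().isdigit():
            continue
        entry = ExportEntry(int(row[0]), row[1].strip(), row[2].strip())
        entries[(entry.platform, entry.id)] = entry
    return entries


# The first row uses the OpenDigger ID shown in this guide; the second is invented.
sample = "288431943,github,X-lab2017/open-digger\n12345,gitee,example-org/example-repo\n"
index = parse_export_list(sample)
print(index[("github", 288431943)].name)
```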
For OpenDigger's export strategy, please refer to the export table section in the developer documentation.
The search components for various repositories and users on the OpenDigger official webpage use these two files for local browsing.
Metadata
For the exported repositories and users, OpenDigger also exports a metadata file available at:
Repository: https://oss.open-digger.cn/{platform}/{org/login}/{repo}/meta.json (example: OpenDigger metadata)
User: https://oss.open-digger.cn/{platform}/{org/login}/meta.json (example: User metadata)
The metadata includes the following fields:
- updatedAt: Data update timestamp.
- type: Repository or user type; values are repo or user.
- id: Unique ID of the repository or user in the platform's database.
- labels: Tags associated with the repository or user in OpenDigger. For details, please refer to the relevant documentation on tag data.
For example, for the OpenDigger repository, its metadata is:
{
"updatedAt": 1725221391661, // Update timestamp
"type": "repo", // Metadata type
"id": 288431943, // Unique ID of the repository on GitHub
"labels": [ // Tags associated with the repository
{
"id": ":communities/mulan", // Mulan Community
"name": "Mulan",
"type": "Community"
},
{
"id": ":communities/xlab", // X-lab Community
"name": "X-lab",
"type": "Community"
},
{
"id": ":communities/xlab/open_digger", // OpenDigger Project
"name": "OpenDigger",
"type": "Project"
},
{
"id": ":regions/CN", // China Project
"name": "China",
"type": "Region"
}
]
}
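The metadata above can be read as sketched below. Two things are assumptions rather than documented facts: the magnitude of updatedAt suggests milliseconds since the Unix epoch, and the helper name summarize_meta is hypothetical. The inline // comments in the example are explanatory only and are not valid JSON, so the sample here is written as a plain Python dict with two of the four labels.

```python
from datetime import datetime, timezone


def summarize_meta(meta: dict) -> dict:
    """Convert the timestamp and group label names by label type."""
    # Assumption: updatedAt is in milliseconds since the Unix epoch.
    updated = datetime.fromtimestamp(meta["updatedAt"] / 1000, tz=timezone.utc)
    labels_by_type = {}
    for label in meta.get("labels", []):
        labels_by_type.setdefault(label["type"], []).append(label["name"])
    return {
        "kind": meta["type"],
        "id": meta["id"],
        "updated": updated,
        "labels": labels_by_type,
    }


meta = {
    "updatedAt": 1725221391661,
    "type": "repo",
    "id": 288431943,
    "labels": [
        {"id": ":communities/mulan", "name": "Mulan", "type": "Community"},
        {"id": ":regions/CN", "name": "China", "type": "Region"},
    ],
}
summary = summarize_meta(meta)
print(summary["updated"].date(), summary["labels"])
```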
FAQ
Q: Does OpenDigger's metric data support integration into other applications?
A: Yes, OpenDigger's metric data is open for downstream application integration. OpenDigger adds Access-Control-Allow-Origin: * to the response headers of the exported static data, so the data can be used cross-origin. If your website has strict domain requirements for response headers, you may need to implement a forwarding service for the data. OpenDigger data is already used in many downstream applications, such as HyperCRX, OpenLeaderboard, and OSGraph.
Q: Do applications need to implement their own caching strategy for metric data?
A: Browser applications can use the data directly without implementing their own cache. OpenDigger includes the Expires field in the response headers for metric data, so after a monthly update browsers can serve the data from disk cache until the 2nd of the following month, avoiding redundant remote requests.
Q: What might cause a metric file to be unavailable when accessed?
A: If a metric file is unavailable, it may be due to the following reasons:
- The repository or user is not within the export range. You can check metadata to determine if they are included in the export range; if metadata exists, then the data has been exported.
- The repository does not have corresponding events. For example, some Apache projects do not use GitHub Issues (e.g., Flink), and thus relevant metric files may not exist.
Q: Why might some specific months or quarters be missing keys in the metric file?
A: If a repository or user has no events of a certain type during a specific period, the corresponding keys are absent from the metrics data. The metrics data never contains 0 values, so the keys may be discontinuous.
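When a continuous monthly series is needed, for example for a line chart, the absent months can be filled in on the consumer side. The sketch below walks from the first to the last monthly key and inserts a fill value where a key is missing. The function name fill_monthly_gaps is hypothetical, and treating an absent month as 0 is an assumption that suits count-like metrics; for other metrics a different fill value may be more appropriate.

```python
import re

MONTH_KEY = re.compile(r"\d{4}-(0[1-9]|1[0-2])")


def fill_monthly_gaps(metric: dict, fill=0) -> dict:
    """Return a continuous monthly series from the first to the last month present."""
    # YYYY-MM keys sort chronologically when sorted as strings.
    months = sorted(key for key in metric if MONTH_KEY.fullmatch(key))
    if not months:
        return {}
    year, month = (int(part) for part in months[0].split("-"))
    last = months[-1]
    filled = {}
    while True:
        key = f"{year:04d}-{month:02d}"
        filled[key] = metric.get(key, fill)
        if key == last:
            break
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return filled


# Yearly and quarterly keys are ignored; only monthly keys are filled.
sample = {"2023": 4, "2023-11": 3, "2024-02": 1, "2023Q4": 3}
print(fill_monthly_gaps(sample))
```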