Dataset and Variables
Dataset
The dataset used in this project, MC1_graph.json, is a JSON file generated by the node_link_data() function of Python’s networkx library. It can be loaded back into a networkx graph object with the corresponding node_link_graph() function. The root-level JSON object consists of graph-level properties specifying that the graph is directed and a multigraph, a “nodes” key that holds the list of nodes, and a “links” key that holds the list of edges.
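To make that layout concrete, here is a minimal sketch that parses a tiny hand-written stand-in for the file using only Python's standard library. The node ids and the sample edge are invented for illustration. The key names `directed`, `multigraph`, `nodes`, and `links` are the ones networkx writes in node-link format, while `Node Type` and `Edge Type` are assumed attribute names based on the column names used in this section.

```python
import json

# Hand-written stand-in for MC1_graph.json; ids and values are invented.
sample_text = """
{
  "directed": true,
  "multigraph": true,
  "nodes": [
    {"id": 0, "Node Type": "Person"},
    {"id": 1, "Node Type": "Song"}
  ],
  "links": [
    {"source": 0, "target": 1, "key": 0, "Edge Type": "PerformerOf"}
  ]
}
"""

data = json.loads(sample_text)

# Graph-level properties sit next to the node and edge lists.
is_directed = data["directed"]
is_multigraph = data["multigraph"]
nodes = data["nodes"]
links = data["links"]

print(f"directed={is_directed}, multigraph={is_multigraph}")
print(f"{len(nodes)} nodes, {len(links)} edges")
```

For the real file, the same code should apply after swapping `json.loads(sample_text)` for `json.load()` on an open handle to MC1_graph.json.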
Nodes
The nodes dataset contains 17,412 entries, each representing an entity within the music network and categorized under the Node Type column as “Person”, “MusicalGroup”, “RecordLabel”, “Song”, or “Album”. Each node carries attributes that depend on its type: for example, songs have fields such as single, release_date, genre, and notable, while people may have stage_name and notoriety_date. Please refer to the following table for more details.
Node Type | Description | Attributes |
---|---|---|
Person | Anyone in the music industry, including singers, producers, instrumentalists, composers, etc. | |
MusicalGroup | Bands, quartets, small choirs, or other officially organized entities formed by musicians to make music. | |
RecordLabel | Organizations (professional, commercial, or otherwise institutional) involved in the recording, production, or distribution of music. | |
Song | A song | |
Album | A music album | |
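One way to see which node types and attributes are present is to tally them directly from the “nodes” list. The sketch below does this on a small invented sample. The `Node Type` key and the attribute values are assumptions for illustration; only attribute names such as `stage_name`, `genre`, and `single` come from the description above.

```python
from collections import Counter

# Invented sample shaped like the "nodes" list; the real list has 17,412 entries.
nodes = [
    {"id": 0, "Node Type": "Person", "stage_name": "Example"},
    {"id": 1, "Node Type": "Song", "genre": "Example Genre", "single": True},
    {"id": 2, "Node Type": "Song", "genre": "Example Genre", "single": False},
    {"id": 3, "Node Type": "MusicalGroup"},
]

# Tally how many nodes fall under each type.
type_counts = Counter(node["Node Type"] for node in nodes)

# Collect the attribute names seen for each type, ignoring the bookkeeping keys.
attributes_by_type = {}
for node in nodes:
    extra_keys = set(node) - {"id", "Node Type"}
    attributes_by_type.setdefault(node["Node Type"], set()).update(extra_keys)

for node_type, count in type_counts.items():
    print(node_type, count, sorted(attributes_by_type[node_type]))
```

Run against the real “nodes” list, the same loop would report the attributes that actually occur for each node type.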
Edges
The edges dataset contains 37,857 records and 4 fields that represent the relationships between entities in the network. Each edge holds the node IDs of its starting and ending points (source and target) and an Edge Type, one of 12 values describing the nature of the relationship, such as “PerformerOf”, “ComposerOf”, or “RecordedBy”. The key field distinguishes multiple connections between the same pair of nodes. Please refer to the following table for more details.
Edge Type | Description |
---|---|
PerformerOf | Indicates that the source node (Person or MusicalGroup) performed the destination node (Song or Album) |
ComposerOf | Indicates that the source node (Person) composed the destination node (Song or Album) |
ProducerOf | Indicates that the source node (Person or RecordLabel) participated in the production of the destination node’s work (Song, Album, Person, or MusicalGroup) |
LyricistOf | Indicates that the source node (Person) wrote lyrics for the destination node (Song or Album) |
RecordedBy | Indicates that the destination node (RecordLabel) aided in the recording process for the source node (Song or Album) |
DistributedBy | Indicates that the destination node (RecordLabel) aided in the distribution process for the source node (Song or Album) |
InStyleOf | Indicates that the source node (Song or Album) was performed at least partly in the style of the destination node (Song, Album, Person, or MusicalGroup) |
InterpolatesFrom | Indicates that the source node (Song or Album) interpolated a melody from the destination node (Song or Album) |
CoverOf | Indicates that the source node (Song or Album) is a cover of the destination node (Song or Album) |
LyricalReferenceTo | Indicates that the source node (Song or Album) makes a lyrical reference to the destination node (Song or Album) |
DirectlySamples | Indicates that the source node (Song or Album) contains one or more audio recordings that directly reuse a portion of the audio recording of the destination node (Song or Album) via sampling |
MemberOf | Indicates that the source node (Person) is (or was) a member of the destination node (MusicalGroup) |
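The source and destination types in the table above can be encoded as a lookup and used to sanity-check edges. The sketch below is one possible way to do that, not a procedure taken from the dataset's documentation. The sample nodes and edges are invented, `Edge Type` is an assumed attribute name, and the last sample edge is deliberately reversed to show what a violation looks like. Whether every edge in the real file conforms to the table is not something this example claims.

```python
# Shorthand for node-type groups that recur in the edge table.
WORK = {"Song", "Album"}
ARTIST = {"Person", "MusicalGroup"}

# Edge type -> (allowed source types, allowed destination types), per the table.
ALLOWED = {
    "PerformerOf": (ARTIST, WORK),
    "ComposerOf": ({"Person"}, WORK),
    "ProducerOf": ({"Person", "RecordLabel"}, WORK | ARTIST),
    "LyricistOf": ({"Person"}, WORK),
    "RecordedBy": (WORK, {"RecordLabel"}),
    "DistributedBy": (WORK, {"RecordLabel"}),
    "InStyleOf": (WORK, WORK | ARTIST),
    "InterpolatesFrom": (WORK, WORK),
    "CoverOf": (WORK, WORK),
    "LyricalReferenceTo": (WORK, WORK),
    "DirectlySamples": (WORK, WORK),
    "MemberOf": ({"Person"}, {"MusicalGroup"}),
}

# Invented sample: node id -> node type, plus a few edges.
node_type = {0: "Person", 1: "Song", 2: "MusicalGroup", 3: "RecordLabel"}
links = [
    {"source": 0, "target": 1, "key": 0, "Edge Type": "PerformerOf"},
    # Second edge between the same pair of nodes, told apart by its key.
    {"source": 0, "target": 1, "key": 1, "Edge Type": "ComposerOf"},
    {"source": 0, "target": 2, "key": 0, "Edge Type": "MemberOf"},
    # Deliberately reversed: RecordedBy should run from a work to a label.
    {"source": 3, "target": 1, "key": 0, "Edge Type": "RecordedBy"},
]


def find_violations(links, node_type):
    """Return the edges whose endpoint types do not match the table."""
    bad = []
    for link in links:
        sources, targets = ALLOWED[link["Edge Type"]]
        source_ok = node_type[link["source"]] in sources
        target_ok = node_type[link["target"]] in targets
        if not (source_ok and target_ok):
            bad.append(link)
    return bad


bad_links = find_violations(links, node_type)
print(f"{len(bad_links)} of {len(links)} sample edges violate the table")
```

Keeping the rules in a single dictionary means a change to the table only has to be mirrored in one place.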