Use efficient file formats for AI/ML development
Description
Data processing and storage constitute a significant portion of AI/ML development and impact the carbon footprint of your application. Variety and volumes of data might need to be captured and pre-processed for building the ML model. Efficient storage of the model becomes extremely important to manage the data used for ML model development.
Solution
Use efficient file formats for building your ML models. For instance, column-oriented data file formats like Parquet provide efficient data storage and retrieval as compared to formats like CSV.
SCI Impact
SCI = (E * I) + M per R
Software Carbon Intensity Spec
Using efficient file formats for ML development impacts SCI as follows:
E
: A more efficient file format for ML development means more efficient data storage and retrieval, resulting in lower overall energy consumption.M
: A more efficient file format for ML development reduces the amount of storage space and number of servers needed, resulting in a lower overall embodied carbon.
Assumptions
None
Considerations
Evaluate and consider the most energy efficient formats required for your application.