publications
2024
- Open-source sky image datasets for solar forecasting with deep learning: A comprehensive survey. Yuhao Nie, Xiatong Li, Quentin Paletta, and 3 more authors. Renewable and Sustainable Energy Reviews, 2024.
Sky image-based solar forecasting using deep learning has been recognized as a promising approach to reducing the uncertainty of solar power generation. However, a major challenge is the lack of large quantities of sky image data encompassing diverse sky conditions for model training. This study presents a comprehensive survey of open-source sky image datasets for solar forecasting and related research areas, including cloud segmentation, classification and motion prediction, which could potentially enhance solar forecasting capabilities. In total, 72 open-source sky image datasets are identified globally that satisfy the needs of deep learning-based method development. A database containing information about various aspects of the identified datasets is constructed. A multi-criteria ranking system is further developed to evaluate each dataset based on eight dimensions which could have important impacts on the data usage. Finally, insights on the applications of these datasets are provided. This study streamlines the processes of identifying and selecting sky image datasets, and could potentially accelerate method development and benchmarking in solar forecasting and related fields including energy meteorology and atmospheric science.
2023
- Advances in solar forecasting: Computer vision with deep learning. Quentin Paletta, Guillermo Terrén-Serrano, Yuhao Nie, and 6 more authors. Advances in Applied Energy, 2023.
Renewable energy forecasting is crucial for integrating variable energy sources into the grid. It allows power systems to address the intermittency of the energy supply at different spatiotemporal scales. To anticipate the future impact of cloud displacements on the energy generated by solar facilities, conventional modeling methods rely on numerical weather prediction or physical models, which have difficulties in assimilating cloud information and learning systematic biases. Augmenting computer vision with machine learning overcomes some of these limitations by fusing real-time cloud cover observations with surface measurements acquired from multiple sources. This Review summarizes recent progress in solar forecasting from multisensor Earth observations with a focus on deep learning, which provides the necessary theoretical framework to develop architectures capable of extracting relevant information from data generated by ground-level sky cameras, satellites, weather stations, and sensor networks. Overall, machine learning has the potential to significantly improve the accuracy and robustness of solar energy meteorology; however, more research is necessary to realize this potential and address its limitations.
- SkyGPT: Probabilistic short-term solar forecasting using synthetic sky videos from physics-constrained VideoGPT. Yuhao Nie, Eric Zelikman, Andea Scott, and 2 more authors. arXiv preprint arXiv:2306.11682, 2023.
In recent years, deep learning-based solar forecasting using all-sky images has emerged as a promising approach for alleviating uncertainty in PV power generation. However, the stochastic nature of cloud movement remains a major challenge for accurate and reliable solar forecasting. With the recent advances in generative artificial intelligence, the synthesis of visually plausible yet diversified sky videos has potential for aiding in forecasts. In this study, we introduce SkyGPT, a physics-informed stochastic video prediction model that is able to generate multiple possible future images of the sky with diverse cloud motion patterns, by using past sky image sequences as input. Extensive experiments and comparison with benchmark video prediction models demonstrate the effectiveness of the proposed model in capturing cloud dynamics and generating future sky images with high realism and diversity. Furthermore, we feed the generated future sky images from the video prediction models for 15-minute-ahead probabilistic solar forecasting for a 30-kW roof-top PV system, and compare it with an end-to-end deep learning baseline model SUNSET and a smart persistence model. Better PV output prediction reliability and sharpness are observed by using the predicted sky images generated with SkyGPT compared with other benchmark models, achieving a continuous ranked probability score (CRPS) of 2.81 (13% better than SUNSET and 23% better than smart persistence) and a Winkler score of 26.70 for the test set. Although an arbitrary number of futures can be generated from a historical sky image sequence, the results suggest that 10 future scenarios is a good choice that balances probabilistic solar forecasting performance and computational cost.
- SKIPP’D: A SKy Images and Photovoltaic Power generation Dataset for short-term solar forecasting. Yuhao Nie, Xiatong Li, Andea Scott, and 3 more authors. Solar Energy, 2023.
Large-scale integration of photovoltaics (PV) into electricity grids is challenged by the intermittent nature of solar power. Sky-image-based solar forecasting using deep learning has been recognized as a promising approach to predicting the short-term fluctuations. However, there are few publicly available standardized benchmark datasets for image-based solar forecasting, which limits the comparison of different forecasting models and the exploration of forecasting methods. To fill these gaps, we introduce SKIPP’D—a SKy Images and Photovoltaic Power Generation Dataset. The dataset contains three years (2017–2019) of quality-controlled down-sampled sky images and PV power generation data that is ready-to-use for short-term solar forecasting using deep learning. In addition, to support the flexibility in research, we provide the high resolution, high frequency sky images and PV power generation data as well as the concurrent sky video footage. We also include a code base containing data processing scripts and baseline model implementations for researchers to reproduce our previous work and accelerate their research in solar forecasting.
2022
- Sky-image-based solar forecasting using deep learning with multi-location data: training models locally, globally or via transfer learning? Yuhao Nie, Quentin Paletta, Andea Scott, and 5 more authors. 2022.
Solar forecasting from ground-based sky images has shown great promise in reducing the uncertainty in solar power generation. With more and more sky image datasets open sourced in recent years, the development of accurate and reliable deep learning-based solar forecasting methods has seen a huge growth in potential. In this study, we explore three different training strategies for solar forecasting models by leveraging three heterogeneous datasets collected globally with different climate patterns. Specifically, we compare the performance of local models trained individually based on single datasets and global models trained jointly based on the fusion of multiple datasets, and further examine the knowledge transfer from pre-trained solar forecasting models to a new dataset of interest. The results suggest that the local models work well when deployed locally, but significant errors are observed when applied offsite. The global model can adapt well to individual locations at the cost of a potential increase in training efforts. Pre-training models on a large and diversified source dataset and transferring to a target dataset generally achieves superior performance over the other two strategies. With 80% less training data, it can achieve comparable performance as the local baseline trained using the entire dataset.
2021
- Resampling and data augmentation for short-term PV output prediction based on an imbalanced sky images dataset using convolutional neural networks. Yuhao Nie, Ahmed S. Zamzam, and Adam Brandt. Solar Energy, 2021.
Integrating photovoltaics (PV) into electricity grids is challenged by potentially large fluctuations in power generation. In recent years, sky image-based PV output prediction using convolutional neural networks (CNNs) has emerged as a promising approach to forecasting fluctuations. A key challenge is imbalanced sky image datasets: because of the geography of solar PV system installations, sky image datasets are often rich in sunny condition data but deficient in cloudy condition data. This imbalance contrasts with the fact that model errors are dominated by cloudy condition performance. In this study, we attempt to remedy this by exploring the enrichment and augmentation of an imbalanced sky images dataset for two PV output prediction tasks: nowcasting (predicting concurrent PV output) and forecasting (predicting 15-minute-ahead future PV output). We empirically examine the efficacy of using different resampling and data augmentation approaches to create a rebalanced dataset for model development. A three-stage greedy search is used to determine the optimal resampling approach, data augmentation techniques and over-sampling rate. The results show that for the nowcast problem, resampling and data augmentation can effectively enhance the model performance, reducing overall root mean squared error (RMSE) by an average of 4%, or a 15 std. (standard deviation) of improvement compared to the variability of the baseline model. In contrast, the treatment RMSE for the forecast problem nearly always overlaps the baseline performance at the ± 2 std. level. The optimal resampling approach expands on the original dataset by over-sampling the minority cloudy data, with the best results from large over-sampling rate (e.g., 4–6 times over-sampling of cloudy images).
- Greenhouse gas emissions of Western Canadian natural gas: Proposed emissions tracking for life cycle modeling. Ryan E. Liu, Arvind P. Ravikumar, Xiaotao Tony Bi, and 4 more authors. Environmental Science & Technology, 2021.
Natural gas (NG) produced in Western Canada is a major and growing source of Canada’s energy and greenhouse gas (GHG) emissions portfolio. Despite recent progress, there is still only limited understanding of the sources and drivers of Western Canadian greenhouse gas (GHG) emissions. We conduct a case study of a production facility based on Seven Generation Energy Ltd.’s Western Canadian operations and an upstream NG emissions intensity model. The case study upstream emissions intensity is estimated to be 3.1–4.0 gCO2e/MJ NG compared to current best estimates of British Columbia (BC) emissions intensities of 6.2–12 gCO2e/MJ NG and a US average estimate of 15 gCO2e/MJ. The analysis reveals that compared to US studies, public GHG emissions data for Western Canada is insufficient as current public data satisfies only 50% of typical LCA model inputs. Company provided data closes most of these gaps (∼80% of the model inputs). We recommend more detailed data collection and presentation of government reported data such as a breakdown of vented and fugitive methane emissions by source. We propose a data collection template to facilitate improved GHG emissions intensity estimates and insight about potential mitigation strategies.
2020
- PV power output prediction from sky images using convolutional neural network: The comparison of sky-condition-specific sub-models and an end-to-end model. Yuhao Nie, Yuchi Sun, Yuanlei Chen, and 2 more authors. Journal of Renewable and Sustainable Energy, 2020.
Photovoltaics (PV), the primary use of solar energy, is growing rapidly. However, the variable output of PV under changing weather conditions may hinder the large-scale deployment of PV. In this study, we propose a two-stage classification-prediction framework to predict contemporaneous PV power output from sky images (a so-called “nowcast”), and compare it with an end-to-end convolution neural network (CNN). The proposed framework first classifies input images into different sky conditions and then the classified images are sent to specific sub-models for PV output prediction. Two types of classifiers are developed and compared: (1) a CNN-based classifier trained on clear sky index (CSI)-labeled sky images and (2) a physics-based non-parametric classifier based on a threshold of fractional cloudiness of sky images. Different numbers of classification categories are also examined. The results suggest that the cloudiness-based classifier is more suitable than the CSI-based classifier for the framework, and the 3-class classification (i.e., sunny, cloudy, overcast) is found to be the optimal choice. We then fine-tune the cloudiness threshold for the non-parametric classifier and tailor the architecture for each sky-condition-specific sub-model. Under the best design, the proposed framework can achieve a root mean squared error (RMSE) of 2.20 kW (relative to a 30 kW rated PV array) on the test set comprising 18 complete days (9 sunny, RMSE = 0.69 kW; 9 cloudy, RMSE = 3.06 kW). Compared with the end-to-end CNN baseline model, the overall prediction performance can be improved by 6% (7% in sunny and 6% in cloudy), with 6% fewer trainable parameters needed in the architecture.
- Greenhouse-gas emissions of Canadian liquefied natural gas for use in China: Comparison and synthesis of three independent life cycle assessments. Yuhao Nie, Siduo Zhang, Ryan Edward Liu, and 7 more authors. Journal of Cleaner Production, 2020.
Liquefied natural gas (LNG) is a promising alternative to coal to mitigate the greenhouse gas (GHG) and particulate emissions from power, industry, and district heating in China. While numerous existing life cycle assessment (LCA) studies estimate the GHG footprint of LNG, large variation exists in these results. Such variability could be caused by differing project designs, system boundaries, modeling methods and data sources. It is not clear which of these factors is the most important. Here, three research groups from Canada and the US performed independent LCAs of the same planned LNG supply chain from Canada to China. The teams applied different methods and assumptions but used aligned system boundaries and worked with a single upstream producer to obtain production data. The GHG emissions of Canadian LNG to China for power and heat generation were found to be 427–556 g CO2-eq/kWh and 81–92 g CO2-eq/MJ. Compared with Chinese coal for power generation, 291–687 g CO2-eq (34%–62%) reduction can be achieved per kWh of power generated. The central tendency in each study is aligned more closely than the overall uncertainty range: thus, uncertainty caused by fundamental data challenges likely outweighs variability caused by use of different LCA methods. Differences in assumptions and methods among the three teams lead to moderate variation at the stage level, but in better agreement at the life-cycle level, showing the existence of compensating variation. Given the robustness to very different LCA methods, existing literature variation may be explained by project-, location- and operator-dependent parameters.
- Repeated leak detection and repair surveys reduce methane emissions over scale of years. Arvind P. Ravikumar, Daniel Roda-Stuart, Ryan Liu, and 6 more authors. Environmental Research Letters, 2020.
Reducing methane emissions from the oil and gas industry is a critical climate action policy tool in Canada and the US. Optical gas imaging-based leak detection and repair (LDAR) surveys are commonly used to address fugitive methane emissions or leaks. Despite widespread use, there is little empirical measurement of the effectiveness of LDAR programs at reducing long-term leakage, especially over the scale of months to years. In this study, we measure the effectiveness of LDAR surveys by quantifying emissions at 36 unconventional liquids-rich natural gas facilities in Alberta, Canada. A representative subset of these 36 facilities were visited twice by the same detection team: an initial survey and a post-repair re-survey occurring ∼0.5–2 years after the initial survey. Overall, total emissions reduced by 44% after one LDAR survey, combining a reduction in fugitive emissions of 22% and vented emissions by 47%. Furthermore, >90% of the leaks found in the initial survey were not emitting in the re-survey, suggesting high repair effectiveness. However, fugitive emissions reduced by only 22% because of new leaks that occurred between the surveys. This indicates a need for frequent, effective, and low-cost LDAR surveys to target new leaks. The large reduction in vent emissions is associated with potentially stochastic changes to tank-related emissions, which contributed ∼45% of all emissions. Our data suggest a key role for tank-specific abatement strategies as an effective way to reduce oil and gas methane emissions. Finally, mitigation policies will also benefit from more definitive classification of leaks and vents.
- Optimal design of the power generation network in California: Moving towards 100% renewable electricity by 2045. Wennan Long, Yuhao Nie, Yunan Li, and 1 more author. International Journal of Energy and Power Engineering, 2020.
To fight climate change, the California government issued Senate Bill No. 100 (SB-100) in September 2018, which sets a target of 100% renewable electricity by the end of 2045. A capacity expansion problem is solved in this case study using a binary quadratic programming model. The optimal locations and capacities of the potential renewable power plants (i.e., solar, wind, biomass, geothermal and hydropower), the phase-out schedule of existing fossil-based (natural gas) power plants and the transmission of electricity across the entire network are determined with the minimal total annualized cost measured by net present value (NPV). The results show that the renewable electricity contribution could increase to 85.9% by 2030 and reach 100% by 2035. Fossil-based power plants will be totally phased out around 2035, and solar and wind will ultimately become the dominant renewable energy resources in California's electricity mix.