
Article Overview

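Abstract (translation of the Chinese version): Power consumption prediction is an important basis for managing the operating energy consumption of crude oil pipelines, and it helps oil transportation enterprises formulate operating schemes such as batch scheduling and load distribution. Compared with traditional prediction methods such as process calculation and statistical analysis, machine learning methods predict better on high-dimensional, nonlinear pipeline operation data. However, because data acquisition is costly and the data are subject to security and confidentiality restrictions, the available pipeline data set is often a small sample, and a model built on it struggles to reach the prediction accuracy required in actual production. To improve prediction ability on small sample sets, a pipeline power consumption prediction model that combines the bootstrap method with a support vector machine is proposed on the basis of data generation theory. The bootstrap method expands the original small sample set: virtual samples are generated according to the distribution of the original data set and fill the information gaps between samples, which helps avoid over-fitting. Particle swarm optimization is used to tune the hyperparameters of the support vector machine and improve the fitting ability of the model. Two stations of an insulated crude oil pipeline in China are taken as an example for modeling and prediction. The results show that, compared with using only the original data set, most predicted values are closer to the true values after virtual samples are added. When 50 groups of virtual samples were added for each of the two stations, the mean absolute error (MAE) of the monthly power consumption predictions decreased by 32.38% and 29.74%, respectively. This shows that adding virtual samples to the original data set to expand its size can effectively reduce the prediction error and improve the fitting ability of the model, and it offers a new approach to the shortage of usable samples caused by high data acquisition costs and enterprises' emphasis on data security.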
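The MAE reductions quoted above are relative changes between a model trained on the original data and one trained on the augmented data. A minimal sketch of that arithmetic follows. It assumes the reduction is computed relative to the original-data MAE, which the abstract does not state explicitly. The function names and all numbers are hypothetical toy values, not results from the paper.

```python
def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over all samples."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def relative_reduction(mae_before, mae_after):
    """Percentage by which the MAE dropped, relative to mae_before (assumed definition)."""
    return 100.0 * (mae_before - mae_after) / mae_before


# Toy monthly power consumption values (hypothetical, arbitrary units).
actual = [100.0, 120.0, 110.0, 130.0]
pred_original = [92.0, 128.0, 118.0, 122.0]   # model trained on original samples only
pred_augmented = [94.0, 126.0, 116.0, 124.0]  # model trained with virtual samples added

mae_original = mean_absolute_error(actual, pred_original)
mae_augmented = mean_absolute_error(actual, pred_augmented)
reduction_percent = relative_reduction(mae_original, mae_augmented)
print(mae_original, mae_augmented, reduction_percent)
```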


Keywords: crude oil pipeline; power consumption prediction; bootstrap method; support vector machine; small sample; virtual sample

Abstract：

Accurate power consumption prediction is an important basis for managing the operating energy consumption of a crude oil pipeline, and it helps oil transportation enterprises formulate batch scheduling, load distribution, and other operating schemes. Traditional prediction methods such as process calculation and statistical analysis handle high-dimensional, nonlinear pipeline operation data less well than machine learning methods. However, because pipeline data are costly to acquire and subject to security and confidentiality restrictions, the available data set is often a small sample, and a model built on it cannot reach the prediction accuracy required in actual production. To improve prediction ability on small sample sets, a pipeline power consumption prediction model combining the bootstrap method and a support vector machine is proposed on the basis of data generation theory. First, the bootstrap method expands the original small sample set: virtual samples are generated according to the distribution of the original data set and fill the information gaps between samples, which helps avoid over-fitting. Then particle swarm optimization is used to tune the hyperparameters of the support vector machine and improve the fitting ability of the model. Two stations of an insulated crude oil pipeline in China are taken as an example. The prediction results show that, compared with using only the original data set, most predicted values are closer to the true values after virtual samples are added. When 50 groups of virtual samples were added for each of the two stations, the mean absolute error (MAE) of the monthly power consumption predictions was reduced by 32.4% and 29.7%, respectively. This shows that adding virtual samples to the original data set to expand its size can effectively reduce the prediction error and improve the fitting ability of the model. The method offers a new way to address the shortage of usable samples caused by the high cost of pipeline data acquisition and the importance enterprises attach to data security.
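The sample-expansion step described in the abstract can be illustrated with a short sketch. The abstract does not give the authors' exact generation procedure, so this uses a smoothed bootstrap as a stand-in: rows are drawn with replacement from the original set and each feature is perturbed by a small Gaussian jitter scaled to that column's standard deviation. The function name, the `jitter` parameter, and the toy rows are all assumptions made for illustration.

```python
import random
import statistics


def bootstrap_virtual_samples(samples, n_virtual, jitter=0.05, seed=0):
    """Generate n_virtual rows by resampling with replacement plus small noise.

    jitter is the noise scale as a fraction of each column's standard
    deviation (an assumed smoothing step; jitter=0 gives a plain bootstrap).
    """
    if not samples:
        raise ValueError("samples must not be empty")
    rng = random.Random(seed)
    n_cols = len(samples[0])
    # Population standard deviation of each column of the original data.
    col_std = [statistics.pstdev([row[j] for row in samples]) for j in range(n_cols)]
    virtual = []
    for _ in range(n_virtual):
        base = rng.choice(samples)  # draw one original row, with replacement
        virtual.append(
            [base[j] + rng.gauss(0.0, jitter * col_std[j]) for j in range(n_cols)]
        )
    return virtual


# Hypothetical toy rows: [throughput, oil temperature, monthly power consumption].
original = [
    [850.0, 42.0, 310.0],
    [900.0, 40.5, 335.0],
    [780.0, 44.0, 290.0],
    [920.0, 39.0, 350.0],
    [810.0, 43.0, 300.0],
]

virtual = bootstrap_virtual_samples(original, n_virtual=50)
augmented = original + virtual  # training set that a regressor would then be fitted on
print(len(augmented))
```

In a full pipeline the augmented set would be passed to a support vector regressor whose hyperparameters are tuned by particle swarm optimization; that part is omitted here because the abstract gives no details of the search ranges or kernel.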

Key words： crude oil pipeline; energy consumption prediction; bootstrap; support vector machine; small sample; virtual sample

Received: 2021-03-31

PACS:

Funding:

Corresponding author: houleicup@126.com

Cite this article:

ZHU Zhenyu, BAI Xiaozhong, XU Lei, HOU Lei, LIU Jinhai, GU Wenyuan, SUN Xin. Medium term prediction of power consumption of a crude oil pipeline based on a bootstrap method and support vector machine theory. Petroleum Science Bulletin, 2021, 01: 127-137.

Link to this article: