4. Pre-Processing Module
This module contains the pre-processing functions of BuckPy.
- class buckpy.buckpy_preprocessing.LBDistributions(*, friction_factor_le, friction_factor_be, friction_factor_he, friction_factor_fit_type)
Bases: object
Class for lateral buckling calculations, including friction factor distribution fitting.
- Parameters:
friction_factor_le (float, optional) – Low estimate (LE) friction factor, representing the 5th percentile.
friction_factor_be (float, optional) – Best estimate (BE) friction factor, representing the 50th percentile.
friction_factor_he (float, optional) – High estimate (HE) friction factor, representing the 95th percentile.
friction_factor_fit_type (str, optional) – Type of fit to perform: ‘LE_BE_HE’, ‘LE_BE’, or ‘BE_HE’.
- friction_distribution()
Compute the parameters of the lognormal friction factor distribution (axial or lateral) by minimizing the root mean square error (RMSE) between geotechnical estimates and back-calculated friction factors from the lognormal distribution.
- Returns:
mean_friction (np.ndarray) – Array of mean values of the lognormal friction factor distribution.
std_friction (np.ndarray) – Array of standard deviation values of the lognormal friction factor distribution.
location_param (np.ndarray) – Array of location parameters of the lognormal friction factor distribution.
scale_param (np.ndarray) – Array of scale parameters of the lognormal friction factor distribution.
le_fit (np.ndarray) – Array of fitted LE values.
be_fit (np.ndarray) – Array of fitted BE values.
he_fit (np.ndarray) – Array of fitted HE values.
rmse (np.ndarray) – Array of RMSE values for the best fit type.
Notes
The function calculates the parameters of the lognormal friction factor distribution based on LE at the 5th percentile, BE at the 50th percentile, and HE at the 95th percentile.
Examples
>>> lb = LBDistributions(
...     friction_factor_le=[0.5],
...     friction_factor_be=[1.0],
...     friction_factor_he=[1.5],
...     friction_factor_fit_type=['LE_BE_HE']
... )
>>> lb.friction_distribution()
(array([0.9684083]), array([0.30043236]), array([-0.07804666]), array([0.3031342]), array([0.56177265]), array([0.92492127]), array([1.52282131]), array([0.05765844]))
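The documented output can be checked by hand. The sketch below assumes that location_param and scale_param are the mean and standard deviation of ln(friction factor). That reading is consistent with the example values but is not stated in the source. The helper name lognormal_summary is hypothetical, and only the Python standard library is used.

```python
import math
from statistics import NormalDist


def lognormal_summary(mu_ln, sigma_ln, le, be, he):
    """Back-calculate LE/BE/HE and the RMSE for a lognormal distribution.

    mu_ln and sigma_ln are assumed to be the mean and standard deviation
    of ln(friction factor); this is an interpretation, not BuckPy's API.
    """
    z95 = NormalDist().inv_cdf(0.95)            # about 1.645
    le_fit = math.exp(mu_ln - z95 * sigma_ln)   # 5th percentile
    be_fit = math.exp(mu_ln)                    # 50th percentile (median)
    he_fit = math.exp(mu_ln + z95 * sigma_ln)   # 95th percentile
    mean = math.exp(mu_ln + 0.5 * sigma_ln ** 2)
    std = mean * math.sqrt(math.exp(sigma_ln ** 2) - 1.0)
    errors = (le_fit - le, be_fit - be, he_fit - he)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return mean, std, le_fit, be_fit, he_fit, rmse


# Parameters taken from the documented example output.
result = lognormal_summary(-0.07804666, 0.3031342, le=0.5, be=1.0, he=1.5)
print([round(v, 4) for v in result])
# [0.9684, 0.3004, 0.5618, 0.9249, 1.5228, 0.0577]
```

The rounded values agree with the mean, standard deviation, fitted LE/BE/HE and RMSE shown in the example, which supports the assumed meaning of the two parameters.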
- buckpy.buckpy_preprocessing.calc_element_array(df)
Function to create element array based on KP, KP TO and element number.
- Parameters:
df (pandas.DataFrame) – DataFrame containing the expanded KP values.
- Returns:
df – DataFrame containing the elements between each KP value.
- Return type:
pandas.DataFrame
- buckpy.buckpy_preprocessing.calc_expand_kp(df, kp_col)
Function to expand the KP array at intervals of 1000, starting at 1000 and continuing up to the nearest maximum KP.
- Parameters:
df (pandas.DataFrame) – DataFrame containing the original KP values.
kp_col (str) – The column name of the KP values to expand.
- Returns:
df – DataFrame containing the expanded KP values.
- Return type:
pandas.DataFrame
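One plausible reading of this expansion is sketched below with plain Python lists instead of a DataFrame. The function name expand_kp, the step of 1000 KP units, and the choice to stop at the last multiple of the step that does not exceed the maximum KP are all assumptions made for illustration. The source does not specify them.

```python
def expand_kp(kp_values, step=1000.0):
    """Merge a regular KP grid (step, 2*step, ...) into the original KP values.

    Assumption: the grid stops at the last multiple of `step` that does not
    exceed the maximum original KP.
    """
    kp_max = max(kp_values)
    n_steps = int(kp_max // step)
    grid = [i * step for i in range(1, n_steps + 1)]
    # Union of original and grid points, without duplicates, in ascending order.
    return sorted(set(kp_values) | set(grid))


expanded = expand_kp([0.0, 750.0, 2300.0])
print(expanded)  # [0.0, 750.0, 1000.0, 2000.0, 2300.0]
```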
- buckpy.buckpy_preprocessing.calc_kp_interpolation(elem_array, df_oper)
Function to interpolate the RLT, pressure and temperature using KP and operating profile.
- Parameters:
elem_array (numpy.ndarray) – Array containing the KP values of the elements.
df_oper (pandas.DataFrame) – DataFrame containing the original operating profiles data.
- Returns:
df – DataFrame containing the interpolated operating profiles data.
- Return type:
pandas.DataFrame
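The interpolation step can be illustrated with a minimal sketch. It assumes piecewise-linear interpolation over KP with the end values held constant outside the profile range. The source does not state the interpolation scheme, and the function name interpolate_profile and the sample profile are hypothetical.

```python
from bisect import bisect_right


def interpolate_profile(kp_elements, kp_profile, values):
    """Linearly interpolate `values` (given at `kp_profile`) onto `kp_elements`.

    `kp_profile` must be strictly increasing. Points outside the profile
    range are clamped to the end values, as numpy.interp does by default.
    """
    result = []
    for kp in kp_elements:
        if kp <= kp_profile[0]:
            result.append(values[0])
        elif kp >= kp_profile[-1]:
            result.append(values[-1])
        else:
            j = bisect_right(kp_profile, kp)  # first profile point beyond kp
            x0, x1 = kp_profile[j - 1], kp_profile[j]
            y0, y1 = values[j - 1], values[j]
            result.append(y0 + (y1 - y0) * (kp - x0) / (x1 - x0))
    return result


# Hypothetical operating profile: temperature at three KP stations.
kp_profile = [0.0, 1000.0, 3000.0]
temperature = [90.0, 80.0, 60.0]
kp_elements = [0.0, 500.0, 2000.0, 3500.0]
temp_at_elements = interpolate_profile(kp_elements, kp_profile, temperature)
print(temp_at_elements)  # [90.0, 85.0, 70.0, 60.0]
```

The same call would be repeated for each profile quantity (RLT, pressure, temperature). With NumPy available, numpy.interp gives the same result for this kind of input.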
- buckpy.buckpy_preprocessing.calc_lognorm_hoos(type_elt, length_elt, hoos_mean, hoos_std, length_ref, rcm_charac)
Compute the parameters of the horizontal out-of-straightness (HOOS) lognormal distribution for different types of elements (e.g., Straight, Bend, Sleeper, RCM). This function takes into account the scaling factor of the HOOS distribution. For RCM elements, the distribution describes the critical buckling force rather than a HOOS factor.
- Parameters:
type_elt (str) – Type of the element.
length_elt (float) – Length of the element.
hoos_mean (float) – Mean of the HOOS distribution.
hoos_std (float) – Standard deviation of the HOOS distribution.
length_ref (float) – Reference length.
rcm_charac (float) – Characteristic buckling force for the Residual Curvature Method (RCM).
- Returns:
x_range (numpy.ndarray) – Array of values spanning the HOOS distribution for probabilities of exceedance between 0.01% and 99.99%.
cdf_range (numpy.ndarray) – Array of cumulative distribution function (CDF) values corresponding to x_range.
Notes
This function computes the parameters of a lognormal distribution for different types of elements such as Straight, Bend, Sleeper, and RCM (Residual Curvature Method). It then calculates the cumulative distribution function (CDF) over the generated range of values based on the HOOS distribution parameters.
- buckpy.buckpy_preprocessing.calc_lognorm_soil(mu_mean, mu_std)
Compute the parameters of a lognormal distribution for friction factors (axial or lateral).
- Parameters:
mu_mean (float) – The mean of the friction factor distribution.
mu_std (float) – The standard deviation of the friction factor distribution.
- Returns:
mu_range (numpy.ndarray) – Array of values spanning the friction factor distribution for probabilities of exceedance between 0.01% and 99.99%.
cdf_range (numpy.ndarray) – Array of cumulative distribution function (CDF) values corresponding to mu_range.
Notes
The function calculates the shape and scale parameters of a friction factor lognormal distribution based on the provided mean (mu_mean) and standard deviation (mu_std). It then computes the cumulative distribution function (CDF) over the generated range of values.
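The conversion from mean and standard deviation to lognormal shape and scale can be sketched with the standard moment-matching relations. This is a minimal illustration that uses only the Python standard library. The function name lognorm_from_moments, the n_points argument, and the linear spacing of the generated values are assumptions. The source only states that the range covers probabilities between 0.01% and 99.99%.

```python
import math
from statistics import NormalDist


def lognorm_from_moments(mu_mean, mu_std, n_points=5):
    """Return (shape, scale, values, cdf) for a lognormal with given mean/std."""
    # Moment matching: shape is the std of ln(x), scale is the median of x.
    cv_sq = (mu_std / mu_mean) ** 2
    shape = math.sqrt(math.log(1.0 + cv_sq))
    scale = mu_mean / math.sqrt(1.0 + cv_sq)
    std_normal = NormalDist()
    # Evenly spaced values between the 0.01% and 99.99% quantiles (assumed spacing).
    lo = scale * math.exp(shape * std_normal.inv_cdf(0.0001))
    hi = scale * math.exp(shape * std_normal.inv_cdf(0.9999))
    values = [lo + i * (hi - lo) / (n_points - 1) for i in range(n_points)]
    # Lognormal CDF evaluated through the standard normal CDF of ln(x).
    cdf = [std_normal.cdf(math.log(v / scale) / shape) for v in values]
    return shape, scale, values, cdf


shape, scale, values, cdf = lognorm_from_moments(0.9684083, 0.30043236)
print(round(shape, 4), round(math.log(scale), 4))  # 0.3031 -0.078
```

The inputs are the mean and standard deviation from the friction_distribution example, and the recovered shape and ln(scale) match the scale_param and location_param values printed there.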
- buckpy.buckpy_preprocessing.calc_monte_carlo_data(df, df_ends)
Convert the scenario data and pipeline end boundary conditions data to NumPy arrays for Monte Carlo simulations.
- Parameters:
df (pandas.DataFrame) – DataFrame containing the scenario data.
df_ends (pandas.DataFrame) – DataFrame containing the pipeline end boundary conditions data.
- Returns:
np_distr (numpy.ndarray) – 2D array with probabilistic distributions (rows) along the route mesh (columns).
np_scen (numpy.ndarray) – 2D array with scenario properties (rows) along the route mesh (columns).
np_ends (numpy.ndarray) – 2D array with end properties (rows) for the pipeline ends.
Notes
The arrays have the following row layout (index : meaning):
- np_distr:
0 : MUAX_ARRAY
1 : MUAX_CDF_ARRAY
2 : MULAT_ARRAY_HT
3 : MULAT_CDF_ARRAY_HT
4 : MULAT_ARRAY_OP
5 : MULAT_CDF_ARRAY_OP
6 : HOOS_ARRAY
7 : HOOS_CDF_ARRAY
- np_scen:
0 : KP
1 : LENGTH
2 : ROUTE_TYPE
3 : BEND_RADIUS
4 : SW_INST
5 : SW_HT
6 : SW_OP
7 : SCHAR_HT
8 : SCHAR_OP
9 : SV_HT
10 : SV_OP
11 : CBF_RCM
12 : RLT
13 : FRF_HT
14 : FRF_P_OP
15 : FRF_T_OP
16 : FRF_OP
17 : L_BUCKLE_HT
18 : EAF_BUCKLE_HT
19 : L_BUCKLE_OP
20 : EAF_BUCKLE_OP
21 : SECTION_ID
22 : SECTION_KP
23 : SECTION_REF
24 : MUAX_MEAN
25 : MULAT_HT_MEAN
26 : MULAT_OP_MEAN
27 : HOOS_MEAN
- np_ends:
0 : ROUTE_TYPE
1 : KP_FROM
2 : KP_TO
3 : REAC_INST
4 : REAC_HT
5 : REAC_OP
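Because the rows are addressed by position, named indices can make downstream code easier to read. The sketch below mirrors the np_distr and np_ends layouts listed above with IntEnum classes. The class names DistrRow and EndsRow, the nested-list stand-in for np_ends, and its values are hypothetical. Treating each column as one pipeline end is also an assumption.

```python
from enum import IntEnum


class DistrRow(IntEnum):
    """Row indices of np_distr, as listed in the notes."""
    MUAX_ARRAY = 0
    MUAX_CDF_ARRAY = 1
    MULAT_ARRAY_HT = 2
    MULAT_CDF_ARRAY_HT = 3
    MULAT_ARRAY_OP = 4
    MULAT_CDF_ARRAY_OP = 5
    HOOS_ARRAY = 6
    HOOS_CDF_ARRAY = 7


class EndsRow(IntEnum):
    """Row indices of np_ends, as listed in the notes."""
    ROUTE_TYPE = 0
    KP_FROM = 1
    KP_TO = 2
    REAC_INST = 3
    REAC_HT = 4
    REAC_OP = 5


# Hypothetical stand-in for np_ends: six rows, one column per pipeline end.
np_ends_like = [
    [1.0, 1.0],        # ROUTE_TYPE
    [0.0, 4900.0],     # KP_FROM
    [100.0, 5000.0],   # KP_TO
    [0.0, 0.0],        # REAC_INST
    [50.0, 50.0],      # REAC_HT
    [75.0, 75.0],      # REAC_OP
]

# IntEnum members behave as integers, so they index rows directly.
kp_to_last = np_ends_like[EndsRow.KP_TO][-1]
print(kp_to_last)  # 5000.0
```

The same pattern extends to the 28 rows of np_scen. IntEnum members also work as indices on NumPy arrays.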
- buckpy.buckpy_preprocessing.calc_oper_data(df, df_route_ends, pipeline_set, loadcase_set)
Calculate operating data and process it.
- Parameters:
df (pandas.DataFrame) – DataFrame containing the operating data.
df_route_ends (pandas.DataFrame) – DataFrame containing the end boundary conditions.
pipeline_set (str) – Identifier of the pipeline set.
loadcase_set (str) – Identifier of the loadcase set.
- Returns:
df – DataFrame containing the operating data and calculated operating data.
- Return type:
pandas.DataFrame
Notes
This function filters the df DataFrame based on pipeline_set, loadcase_set, and ‘KP To’. It calculates the rolling mean and difference, assigns the ‘Length’ column, resets the index, and drops rows with NaN values before returning the preprocessed DataFrame.
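The rolling mean and difference step can be pictured with a small sketch in plain Python. It assumes a window of two consecutive rows, so that each element takes the average of its two end values and a length equal to the KP difference. The window size, the function name rolling_pairs, and the dictionary keys are assumptions and are not taken from the source.

```python
def rolling_pairs(kp, values):
    """Window-of-two rolling mean of `values` and first difference of `kp`.

    The first point has no predecessor, so it produces no row. In a
    DataFrame workflow this corresponds to dropping the leading NaN row.
    """
    rows = []
    for i in range(1, len(kp)):
        rows.append({
            "KP": kp[i],
            "Length": kp[i] - kp[i - 1],
            "Mean": 0.5 * (values[i] + values[i - 1]),
        })
    return rows


rows = rolling_pairs([0.0, 100.0, 250.0], [10.0, 20.0, 40.0])
print(rows[0])  # {'KP': 100.0, 'Length': 100.0, 'Mean': 15.0}
```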
- buckpy.buckpy_preprocessing.calc_operating_profiles(df, df_route, pipeline_set, loadcase_set)
Calculate operating profiles data and process it.
- Parameters:
df (pandas.DataFrame) – DataFrame containing the operating profiles data.
df_route (pandas.DataFrame) – DataFrame containing route data and calculated route data.
pipeline_set (str) – Identifier of the pipeline set.
loadcase_set (str) – Identifier of the loadcase set.
- Returns:
df – DataFrame containing the operating profiles data and calculated operating data.
- Return type:
pandas.DataFrame
- buckpy.buckpy_preprocessing.calc_pipe_data(df, pipeline_set)
Calculate properties of pipes for a specific pipeline set.
- Parameters:
df (pandas.DataFrame) – DataFrame containing the pipe data.
pipeline_set (str) – Identifier of the pipeline set.
- Returns:
df – DataFrame containing the pipe data and calculated pipe properties.
- Return type:
pandas.DataFrame
Notes
This function filters the df DataFrame based on the pipeline_set. It computes the inner diameter (ID), cross-sectional area (As), inner area (Ai), moment of inertia (I), hydrotest characteristic buckling force (SChar HT), and operation characteristic buckling force (SChar OP) of the pipe.
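The geometric quantities named in the notes follow from the standard formulas for a circular hollow section. The sketch below computes them from an outer diameter and wall thickness. The function name pipe_section_properties, its argument names, and the sample dimensions are hypothetical. The characteristic buckling forces (SChar HT and SChar OP) are left out because the source does not give their formula.

```python
import math


def pipe_section_properties(od, wt):
    """Return ID, steel area (As), internal area (Ai) and second moment of area (I).

    od: outer diameter, wt: wall thickness, both in the same length unit.
    """
    inner_d = od - 2.0 * wt
    area_steel = math.pi / 4.0 * (od ** 2 - inner_d ** 2)
    area_inner = math.pi / 4.0 * inner_d ** 2
    inertia = math.pi / 64.0 * (od ** 4 - inner_d ** 4)
    return inner_d, area_steel, area_inner, inertia


# Hypothetical pipe: 323.9 mm outer diameter, 15.9 mm wall thickness (in metres).
inner_d, area_steel, area_inner, inertia = pipe_section_properties(od=0.3239, wt=0.0159)
print(round(inner_d, 4))  # 0.2921
```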
- buckpy.buckpy_preprocessing.calc_pp_data(df, np_array, pipeline_id, layout_set)
Calculate post-processing data set for a given layout set.
- Parameters:
df (pandas.DataFrame) – DataFrame containing post-processing data.
np_array (numpy.ndarray) – NumPy array containing pipeline end boundary conditions.
pipeline_id (str) – Identifier of the pipeline.
layout_set (str) – Identifier of the layout set.
- Returns:
df – DataFrame containing calculated post-processing data.
- Return type:
pandas.DataFrame
Notes
This function filters the DataFrame based on the layout set. It resets the index, renames columns, and selects the relevant columns. It then adjusts the last ‘KP_to’ value if it is smaller than the maximum value in np_array, and converts the columns to appropriate numeric types.
- buckpy.buckpy_preprocessing.calc_route_data(df, layout_set, pipeline_set)
Extract and process route data for calculations.
- Parameters:
df (pandas.DataFrame) – DataFrame containing route data.
layout_set (str) – Identifier of the layout set.
pipeline_set (str) – Identifier of the pipeline set.
- Returns:
df (pandas.DataFrame) – DataFrame containing route data and calculated route data.
df_ends (pandas.DataFrame) – DataFrame containing end boundary conditions.
Notes
This function extracts route ends and route data based on pipeline_set and layout_set. It selects specific columns for route ends data. Route Type is converted from string to float for numerical representation. Route ends data is converted to a NumPy array for efficient processing.
- buckpy.buckpy_preprocessing.calc_scenario_data(df_route, df_pipe, df_oper, df_soil)
Calculate scenario data based on route, pipe, operating, and soil data.
- Parameters:
df_route (pandas.DataFrame) – DataFrame containing route data.
df_pipe (pandas.DataFrame) – DataFrame containing pipe data.
df_oper (pandas.DataFrame) – DataFrame containing operating data.
df_soil (pandas.DataFrame) – DataFrame containing soil data.
- Returns:
df – DataFrame containing the calculated scenario data.
- Return type:
pandas.DataFrame
Notes
This function merges route, pipe, operating, and soil data to compute various scenario parameters. It calculates various attributes such as lognormal distributions, buckling forces, and section counts. The resulting DataFrame includes a subset of calculated columns and is filled with 0 for missing values.
- buckpy.buckpy_preprocessing.calc_soil_data(df, pipeline_set)
Calculate soil data and axial and lateral friction factor distributions and assign them to DataFrame columns.
- Parameters:
df (pandas.DataFrame) – DataFrame containing soil data.
pipeline_set (str) – Identifier of the pipeline set.
- Returns:
df – DataFrame containing soil data and calculated friction factor distributions.
- Return type:
pandas.DataFrame
Notes
This function filters the df DataFrame based on the pipeline_set value. It computes lognormal distributions for the axial and lateral friction factors and assigns them to DataFrame columns.
- buckpy.buckpy_preprocessing.import_scenario(work_dir, file_name, pipeline_id, scenario_no, bl_verbose=False)
Import scenario data from an Excel file and preprocess it.
- Parameters:
work_dir (str) – Directory where the Excel file is located.
file_name (str) – Name of the Excel file.
pipeline_id (str) – Identifier of the pipeline.
scenario_no (int) – Identifier of the scenario.
bl_verbose (bool, optional) – True if intermediate printouts are required (False by default).
- Returns:
df_scen (pandas.DataFrame) – DataFrame containing the scenario data.
np_distr (numpy.ndarray) – Array containing the probabilistic distributions (friction factors and HOOS).
np_scen (numpy.ndarray) – Array containing the scenario data.
np_ends (numpy.ndarray) – Array containing the end boundary conditions.
df_pp (pandas.DataFrame) – DataFrame containing the post-processing data.
n_sim (int) – Number of simulations.
Notes
This function reads scenario data from an Excel file and preprocesses it. It extracts the layout, pipeline, and loadcase sets and the number of simulations from the Excel file, then processes the route, pipe, operating, soil, and scenario data. Finally, it processes the post-processing sets and defines the NumPy arrays for the Monte Carlo simulations.