neptoon.utils

general_utils

Functions:

validate_and_convert_file_path

validate_and_convert_file_path(file_path, base=None)

Ensures that file paths are correctly parsed into pathlib.Path objects.

Parameters:

Name Type Description Default
file_path str | Path | None

The path to the folder or file.

required
base str | Path | None

Base path to add to the file path (e.g., a custom base directory).

None

Returns:

Type Description
Path | None

The file_path as a pathlib.Path object.

Raises:

Type Description
ValueError

Raised if file_path is not a string, pathlib.Path, or None.
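
The contract described above can be sketched with the standard library alone. This is a minimal illustration, not neptoon's implementation: the name `to_path_sketch` is hypothetical, and the rule that `base` is only joined onto relative paths is an assumption.

```python
from pathlib import Path


def to_path_sketch(file_path, base=None):
    """Hypothetical stand-in that mirrors the documented inputs and outputs."""
    if file_path is None:
        # None is passed through unchanged, matching the "Path | None" return type.
        return None
    if not isinstance(file_path, (str, Path)):
        raise ValueError("file_path must be a str, pathlib.Path, or None")
    path = Path(file_path)
    # Assumption: `base` is joined in front of relative paths only.
    if base is not None and not path.is_absolute():
        path = Path(base) / path
    return path


print(to_path_sketch("data/site.csv", base="project"))
print(to_path_sketch(None))
```

Anything other than a string, a Path, or None (for example an integer) raises ValueError in this sketch.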

timedelta_to_freq_str

timedelta_to_freq_str(time_delta)

Convert a timedelta to a pandas frequency string.
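
As a rough illustration of what such a conversion involves, the sketch below expresses a timedelta in the largest unit that divides it evenly. It is not neptoon's code: `freq_str_sketch` is a hypothetical name, and the aliases "D", "h", "min" and "s" follow recent pandas releases (older releases used different spellings for some of them).

```python
from datetime import timedelta


def freq_str_sketch(time_delta):
    """Hypothetical helper: build a frequency-style string from a timedelta."""
    # Fractional seconds are truncated in this sketch.
    total_seconds = int(time_delta.total_seconds())
    if total_seconds <= 0:
        raise ValueError("time_delta must be positive")
    # Try the largest unit first and use it only if it divides evenly.
    for seconds_per_unit, alias in ((86400, "D"), (3600, "h"), (60, "min")):
        if total_seconds % seconds_per_unit == 0:
            return f"{total_seconds // seconds_per_unit}{alias}"
    return f"{total_seconds}s"


print(freq_str_sketch(timedelta(minutes=90)))  # 90min
print(freq_str_sketch(timedelta(hours=1)))     # 1h
```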

validate_timestamp_index

validate_timestamp_index(data_frame)

Checks that the index of the DataFrame is a timestamp index (essential for aligning the timestamps and using SaQC).

Parameters:

Name Type Description Default
data_frame DataFrame

The data frame imported into the TimeStampAligner

required

Raises:

Type Description
ValueError

If the index is not a datetime type.

parse_resolution_to_timedelta

parse_resolution_to_timedelta(resolution_str)

Parse a string representation of a time resolution and convert it to a timedelta object.

This method takes a string describing a time resolution (e.g., "30 minutes", "2 hours", "1 day") and converts it into a Python timedelta object. It supports minutes, hours, and days as units.

Parameters:

Name Type Description Default
resolution_str str

A string representing the time resolution. The format should be "<number> <unit>", where <number> is a positive integer and <unit> is one of the following:
- For minutes: "min", "minute", "minutes"
- For hours: "hour", "hours", "hr", "hrs"
- For days: "day", "days"
The parsing is case-insensitive.

required

Returns:

Type Description
timedelta

A timedelta object representing the parsed time resolution.

Raises:

Type Description
ValueError

If the resolution string format is invalid or cannot be parsed.

ValueError

If an unsupported time unit is provided.
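
The parsing rules above can be sketched with a regular expression and a lookup table. This is an illustrative reimplementation, not the library's code: `parse_resolution_sketch` and `_UNITS` are hypothetical names, and treating the whitespace between number and unit as optional is an assumption.

```python
import re
from datetime import timedelta

# Unit spellings from the parameter description, mapped to timedelta keywords.
_UNITS = {
    "min": "minutes", "minute": "minutes", "minutes": "minutes",
    "hour": "hours", "hours": "hours", "hr": "hours", "hrs": "hours",
    "day": "days", "days": "days",
}


def parse_resolution_sketch(resolution_str):
    """Hypothetical parser for strings such as "30 minutes" or "2 hours"."""
    # Lower-casing first makes the match case-insensitive.
    match = re.fullmatch(r"\s*(\d+)\s*([a-z]+)\s*", resolution_str.lower())
    if match is None:
        raise ValueError(f"Invalid resolution format: {resolution_str!r}")
    value, unit = int(match.group(1)), match.group(2)
    if value <= 0:
        raise ValueError("The number must be a positive integer")
    if unit not in _UNITS:
        raise ValueError(f"Unsupported time unit: {unit!r}")
    return timedelta(**{_UNITS[unit]: value})


print(parse_resolution_sketch("30 minutes"))  # 0:30:00
print(parse_resolution_sketch("2 Hours"))     # 2:00:00
```

Both failure modes listed under Raises map to ValueError here: a string that does not match the pattern, and a unit that is not in the table.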

validate_df

validate_df(df, schema)

Validates a DataFrame against a pandera.pandas DataFrameSchema.

Note: validation is kept lazy so that every issue in the DataFrame is reported.

Parameters:

Name Type Description Default
df DataFrame

Dataframe to validate

required
schema DataFrameSchema

Pandera Schema to check against

required

is_resolution_greater_than

is_resolution_greater_than(resolution_a, resolution_b)

Returns True if resolution_a is greater (coarser) than resolution_b

Parameters:

Name Type Description Default
resolution_a str | timedelta

First resolution to compare

required
resolution_b str | timedelta

Second resolution to compare

required

Returns:

Type Description
bool

True if resolution_a > resolution_b

Note

If a resolution is given as a str, it should be in a form such as "1h", "6hour" or "1day". It is converted automatically inside the function.
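
A comparison of this kind reduces to normalising both arguments to timedelta and comparing them. The sketch below is a self-contained illustration under stated assumptions: `is_coarser_sketch`, `_as_timedelta` and `_SECONDS` are hypothetical names, and the unit table covers only the spellings used in the example rather than everything the library accepts.

```python
import re
from datetime import timedelta

# Hypothetical unit table covering only the spellings used in this example.
_SECONDS = {"min": 60, "h": 3600, "hour": 3600, "day": 86400}


def _as_timedelta(resolution):
    """Return timedelta inputs unchanged and convert strings such as "6hour"."""
    if isinstance(resolution, timedelta):
        return resolution
    match = re.fullmatch(r"(\d+)\s*([a-z]+)", resolution.strip().lower())
    if match is None or match.group(2) not in _SECONDS:
        raise ValueError(f"Cannot parse resolution: {resolution!r}")
    return timedelta(seconds=int(match.group(1)) * _SECONDS[match.group(2)])


def is_coarser_sketch(resolution_a, resolution_b):
    """True when resolution_a spans a longer interval than resolution_b."""
    return _as_timedelta(resolution_a) > _as_timedelta(resolution_b)


print(is_coarser_sketch("6hour", "1h"))              # True
print(is_coarser_sketch("1h", timedelta(days=1)))    # False
```

The comparison is strict, so two equal resolutions return False.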

recalculate_neutron_uncertainty

recalculate_neutron_uncertainty(data_frame, temporal_scaling_factor, uncertainty_col_name=None)

Adjust the statistical uncertainty of the neutron values based on the aggregation.

Parameters:

Name Type Description Default
data_frame DataFrame

DataFrame with data

required
temporal_scaling_factor int | float

The scaling factor to adjust from original resolution to revised output resolution

required
uncertainty_col_name str | None

Name of the uncertainty column. If None, the default supplied in ColumnInfo.Name.CORRECTED_EPI_NEUTRON_COUNT_UNCERTAINTY is used.

None

Returns:

Type Description
_type_

description
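
The formula applied by this function is not stated here. One common convention for counting data, assumed for this sketch and not confirmed as neptoon's method, is that averaging n independent intervals reduces the uncertainty of the mean by a factor of sqrt(n). The example works on plain lists rather than a DataFrame, and `scale_uncertainty_sketch` is a hypothetical name.

```python
import math


def scale_uncertainty_sketch(uncertainties, temporal_scaling_factor):
    """Divide each uncertainty by sqrt(factor); an assumed convention, see above."""
    if temporal_scaling_factor <= 0:
        raise ValueError("temporal_scaling_factor must be positive")
    divisor = math.sqrt(temporal_scaling_factor)
    return [value / divisor for value in uncertainties]


# Assumption: moving from 1-hour to 4-hour data corresponds to a factor of 4.
print(scale_uncertainty_sketch([12.0, 9.0], 4))  # [6.0, 4.5]
```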