CACHET_CADB
- class torch_ecg.databases.CACHET_CADB(db_dir: Optional[Union[str, pathlib.Path]] = None, working_dir: Optional[Union[str, pathlib.Path]] = None, verbose: int = 1, **kwargs: Any)
Bases: torch_ecg.databases.base._DataBase
CACHET-CADB: A Contextualized Ambulatory Electrocardiography Arrhythmia Dataset
ABOUT
The database has 259 days of contextualized ECG recordings from 24 patients and 1,602 manually annotated 10 s heart-rhythm samples.
The length of the ECG records in the CACHET-CADB varies from 24 h to 3 weeks.
The patient’s ambulatory context information (activities, movement acceleration, body position, etc.) is extracted for every 10 s interval cumulatively.
Nearly 11% of the ECG data in the database is found to be noisy.
Webpages for downloading the database [1] and the short-format database [2]; see also the GitHub repository [3].
Usage
ECG arrhythmia detection
Self-Supervised Learning
References
Citation
10.3389/fcvm.2022.893090 10.11583/DTU.14547264 10.11583/DTU.14547330
- Parameters
db_dir (str or pathlib.Path, optional) – Storage path of the database. If not specified, data will be fetched from Physionet.
working_dir (str, optional) – Working directory, to store intermediate files and log files.
verbose (int, default 1) – Level of logging verbosity.
kwargs (dict, optional) – Auxiliary keyword arguments.
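A minimal sketch of constructing the reader from a local copy of the dataset. The helper name `open_cachet_cadb` and its up-front directory check are illustrative assumptions, not part of torch_ecg; only the constructor arguments `db_dir`, `working_dir`, and `verbose` are taken from the signature above.

```python
from pathlib import Path


def open_cachet_cadb(db_dir, working_dir=None, verbose=1):
    """Hypothetical helper: build a CACHET_CADB reader for a local directory."""
    db_dir = Path(db_dir).expanduser()
    # Our own sanity check (not library behaviour): fail early on a bad path.
    if not db_dir.is_dir():
        raise FileNotFoundError(f"database directory not found: {db_dir}")
    # Imported lazily so this module can be imported without torch_ecg installed.
    from torch_ecg.databases import CACHET_CADB

    return CACHET_CADB(db_dir=db_dir, working_dir=working_dir, verbose=verbose)
```

The returned object would then expose the properties and loaders documented below, such as `df_metadata` and `load_data`.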
- property database_info: torch_ecg.databases.base.DataBaseInfo
The DataBaseInfo object of the database.
- property df_metadata: pandas.core.frame.DataFrame
The table of metadata of the records.
- download(files: Optional[Union[str, Sequence[str]]]) → None
Download the database from the DTU website.
- get_absolute_path(rec: Union[str, int], extension: str = 'signal-ecg') → pathlib.Path
Get the absolute path of the signal folder of the record.
- Parameters
- Returns
Absolute path of the file.
- Return type
pathlib.Path
- get_subject_info(rec_or_sid: Union[str, int], items: Optional[List[str]] = None) → Dict[str, str]
Read auxiliary information of a subject (a record) stored in the header files.
- Parameters
- Returns
subject_info – Information about the subject, including “age”, “gender”, “height”, “weight”.
- Return type
dict
- load_ann(rec: Union[str, int], ann_format: str = 'pd') → Union[pandas.core.frame.DataFrame, numpy.ndarray, Dict[Union[int, str], numpy.ndarray]]
Load annotation from the metadata file.
- Parameters
- Returns
ann – The annotation of the record.
- Return type
pandas.DataFrame or numpy.ndarray or dict
- load_context_ann(rec: Union[str, int], sheet_name: Optional[str] = None) → Union[pandas.core.frame.DataFrame, Dict[str, pandas.core.frame.DataFrame]]
Load context annotation.
- Parameters
- Returns
context_ann – Context annotations of the record.
- Return type
pandas.DataFrame or dict
- load_context_data(rec: Union[str, int], context_name: str, sampfrom: Optional[int] = None, sampto: Optional[int] = None, channels: Optional[Union[str, int, List[str], List[int]]] = None, units: Optional[str] = None, fs: Optional[numbers.Real] = None) → Union[numpy.ndarray, pandas.core.frame.DataFrame]
Load context data (e.g. accelerometer, heart rate, etc.).
- Parameters
rec (str or int) – Record name or index of the record in all_records.
context_name (str) – Context name, can be one of “acc”, “angularrate”, “hr_live”, “hrvrmssd_live”, “movementacceleration_live”, “press”, “marker”.
sampfrom (int, optional) – Start index of the data to be loaded.
sampto (int, optional) – End index of the data to be loaded.
channels (str or int or List[str] or List[int], optional) – Channels (names or indices) to be loaded. If None, all channels will be loaded.
units (str, optional) – Units of the output signal, currently can only be “default”; None for digital data, without digital-to-physical conversion.
fs (numbers.Real, optional) – Sampling frequency of the output signal. If not None, the loaded data will be resampled to this frequency; otherwise the original sampling frequency will be used.
- Returns
context_data – Context data in the “channel_first” format.
- Return type
numpy.ndarray or pandas.DataFrame
Note
If the record does not have the specified context data, an empty array or DataFrame will be returned.
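Because a missing context stream yields an empty result rather than an error, callers may want an explicit check. The sketch below is one way to do that; `context_or_none` is a hypothetical helper, and `_EmptyContextReader` is a stand-in object used only so the example runs without the dataset. It relies on both `numpy.ndarray` and `pandas.DataFrame` exposing a `size` attribute.

```python
def context_or_none(reader, rec, context_name, **kwargs):
    """Return context data for a record, or None when the record has none."""
    data = reader.load_context_data(rec, context_name, **kwargs)
    # Both numpy.ndarray and pandas.DataFrame expose `.size` (element count),
    # so a zero size covers either empty return type.
    return None if data.size == 0 else data


class _EmptyContextReader:
    """Stand-in for a CACHET_CADB instance, used only to run the sketch offline."""

    class _Empty:
        size = 0  # mimics an empty array or DataFrame

    def load_context_data(self, rec, context_name, **kwargs):
        return self._Empty()


# With the stand-in reader every context stream is "missing".
missing = context_or_none(_EmptyContextReader(), 0, "acc")
```

With a real reader, the same helper would be called as `context_or_none(reader, rec, "hr_live")`, using any of the context names listed above.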
- load_data(rec: Union[str, int], sampfrom: Optional[int] = None, sampto: Optional[int] = None, data_format: str = 'channel_first', units: Optional[str] = 'mV', fs: Optional[numbers.Real] = None, return_fs: bool = False) → Union[numpy.ndarray, Tuple[numpy.ndarray, numbers.Real]]
Load physical (converted from digital) ECG data, or load digital signal directly.
- Parameters
rec (str or int) – Record name or index of the record in all_records, or “short_format” (-1) to load data from the short format file.
sampfrom (int, optional) – Start index of the data to be loaded.
sampto (int, optional) – End index of the data to be loaded.
data_format (str, default "channel_first") – Format of the ECG data, “channel_last” (alias “lead_last”), or “channel_first” (alias “lead_first”), or “flat” (alias “plain”).
units (str or None, default "mV") – Units of the output signal, can also be “μV” (aliases “uV”, “muV”); None for digital data, without digital-to-physical conversion.
fs (numbers.Real, optional) – Sampling frequency of the output signal. If not None, the loaded data will be resampled to this frequency; otherwise the original sampling frequency will be used.
return_fs (bool, default False) – Whether to return the sampling frequency of the output signal.
- Returns
data (numpy.ndarray) – The loaded ECG data.
data_fs (numbers.Real, optional) – Sampling frequency of the output signal. Returned if return_fs is True.
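Since `sampfrom` and `sampto` are sample indices, a time window in seconds has to be converted before calling `load_data`. The sketch below shows one way to do that, assuming the indices refer to the record's native sampling rate (an assumption; this page does not say). `window_to_samples` and `load_window` are hypothetical helpers, `_RecordingReader` is a stand-in that only echoes its arguments, and the 1024 Hz rate is a placeholder.

```python
def window_to_samples(start_s, duration_s, fs):
    """Convert a time window in seconds to (sampfrom, sampto) sample indices."""
    sampfrom = int(round(start_s * fs))
    sampto = sampfrom + int(round(duration_s * fs))
    return sampfrom, sampto


def load_window(reader, rec, start_s, duration_s, native_fs, **load_kwargs):
    """Load one time window of ECG through the reader's load_data method."""
    sampfrom, sampto = window_to_samples(start_s, duration_s, native_fs)
    return reader.load_data(rec, sampfrom=sampfrom, sampto=sampto, **load_kwargs)


class _RecordingReader:
    """Stand-in for a CACHET_CADB instance; it only echoes the arguments."""

    def load_data(self, rec, **kwargs):
        return {"rec": rec, **kwargs}


# 1024 Hz is a placeholder rate for the demo, not a value taken from this page.
call = load_window(
    _RecordingReader(), 0, start_s=10, duration_s=10,
    native_fs=1024, data_format="channel_last", units="uV",
)
```

With a real reader, the extra keyword arguments (`data_format`, `units`, `fs`, `return_fs`) are passed straight through to `load_data` as documented above.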