utils package

Submodules

utils.analyze module

class utils.analyze.ActionUnits(action_units, group_labels, observation_labels, feat_names, group_dict, scaler)[source]

Bases: object

save(save_path)[source]
scale()[source]
to_df()[source]
utils.analyze.BORIS_to_pose(config, verbose=True)[source]

Take paired BORIS one-hot-encoded observation files and pose-segmented files and align them to show which behaviors line up with which pose modules

Parameters:

config – the config

Returns:

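As an illustration of the alignment idea, the sketch below counts, frame by frame, how often each manually scored behavior co-occurs with each pose module. It is not the implementation of BORIS_to_pose; the helper overlap_matrix, the behavior names, and the in-memory data layout are all hypothetical.

```python
def overlap_matrix(behaviors, modules, behavior_names, n_modules):
    """Count frames in which each behavior co-occurs with each pose module."""
    matrix = {name: [0] * n_modules for name in behavior_names}
    for frame_behaviors, module in zip(behaviors, modules):
        for name, active in zip(behavior_names, frame_behaviors):
            if active:
                matrix[name][module] += 1
    return matrix


# Hypothetical data: one one-hot row per frame, one module label per frame.
behavior_names = ["groom", "rear"]
one_hot = [(1, 0), (1, 0), (0, 0), (0, 1), (0, 1), (1, 0)]
modules = [2, 2, 0, 1, 1, 2]

counts = overlap_matrix(one_hot, modules, behavior_names, n_modules=3)
print(counts)
```

Each row of the result is a non-normalized count vector over modules; dividing a row by its sum would give the fraction of a behavior's frames spent in each module.
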
class utils.analyze.KeypointFeature(keypoint_feature, group_labels, observation_labels, feat_names, group_dict, scaler, feature_type)[source]

Bases: object

save(save_path)[source]
scale()[source]
to_df()[source]
class utils.analyze.ModuleTransitions(transition_counts, transition_count_matrices, group_labels, observation_labels, feat_names, group_dict)[source]

Bases: object

save(save_path)[source]
scale()[source]
to_df()[source]
class utils.analyze.ModuleUsage(label_counts, group_labels, observation_labels, feat_names, group_dict, scaler)[source]

Bases: object

apply_picks(pick_names)[source]
collapse_timebins()[source]
f_oneway()[source]
save(save_path)[source]
scale()[source]
to_df()[source]
usage_density(convolve=False, window=5)[source]
utils.analyze.classify(module_feature_object, method='lda')[source]

Classify pose segmentation data using either module usage or transitions

Parameters:
  • module_feature_object – ModuleUsage or ModuleTransitions object

  • method – classification method to use; options include “lda”, “logisticregression”, “mlp”, “naivebayes”, “knn”, or “randomforest”

Returns:

classifier

utils.analyze.combine_pose_modules(config, labels_df, force_no_numeric=False)[source]

Combine pose modules based on remappings key in config

Parameters:
  • config – config

  • labels_df – from label_counter (subgroups or no_subgroups)

Returns:

labels_df_remapped

utils.analyze.ego_center(config, data, keypoint_ego1, keypoint_ego2)[source]

Ego-center a single pose estimation dataframe

Parameters:
  • config – config

  • data – pose estimation pandas dataframe in DLC format

  • keypoint_ego1 – first keypoint to use for establishing egocentric alignment axis

  • keypoint_ego2 – second keypoint to use for establishing egocentric alignment axis

Returns:

ego_centered pandas dataframe
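
A minimal sketch of egocentric alignment for one frame, assuming it means translating the first keypoint to the origin and rotating so the second keypoint lies on the positive x axis. The function ego_center_frame, the keypoint names, and the dict-based layout are hypothetical; the real function works on a DLC-format dataframe and may differ in convention.

```python
import math


def ego_center_frame(points, ego1, ego2):
    """Translate so ego1 is the origin, then rotate so ego2 lies on the +x axis.

    points: dict mapping keypoint name -> (x, y) for a single frame.
    """
    ox, oy = points[ego1]
    ax, ay = points[ego2]
    # Angle of the ego1 -> ego2 axis; rotating by -theta aligns it with +x.
    theta = math.atan2(ay - oy, ax - ox)
    cos_t, sin_t = math.cos(-theta), math.sin(-theta)
    centered = {}
    for name, (x, y) in points.items():
        dx, dy = x - ox, y - oy
        centered[name] = (dx * cos_t - dy * sin_t, dx * sin_t + dy * cos_t)
    return centered


# Hypothetical keypoints for one frame.
frame = {"tailbase": (2.0, 3.0), "nose": (2.0, 7.0), "left_ear": (1.0, 6.0)}
aligned = ego_center_frame(frame, "tailbase", "nose")
print(aligned)
```

After alignment, every keypoint is expressed relative to the animal's own body axis, so angles and distances can be compared across frames regardless of where the animal is or which way it faces.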

utils.analyze.embed(module_feature_object, method='lda', n_components=2)[source]

Get dimensionality-reduced embeddings of the data with LDA or PCA

Parameters:
  • module_feature_object – module feature object

  • method – LDA or PCA; default LDA

  • n_components – number of components

Returns:

utils.analyze.get_action_units(config, start, end, binsize=None, selected_subgroups=None, aus_to_include='all')[source]

Get mean action units

Parameters:
  • config – project config

  • start – start time (in s)

  • end – end time (in s)

  • binsize – optional; temporal bin size (default None for no binning)

  • selected_subgroups – optional; selected subgroups

  • aus_to_include – list of integers corresponding to action units to include, or “all” for all; default “all”

Returns:

ActionUnits

utils.analyze.get_distance(module_feature_object, method='euclidean')[source]

Get distance

Parameters:
  • module_feature_object

  • method

Returns:

utils.analyze.get_keypoint_kinematics(PE_config, keypoint_ego1, keypoint_ego2, start, end, binsize=None, metric='angle', thresh=70, selected_subgroups='all', return_as_df=True, verbose=False)[source]

Measure keypoint kinematics (egocentrically aligned angle, distance, and travel of keypoints) across group of subjects

Parameters:
  • PE_config – pose estimation config - used for FPS only

  • keypoint_ego1 – first keypoint to use for establishing egocentric alignment axis

  • keypoint_ego2 – second keypoint to use for establishing egocentric alignment axis

  • start – start time in seconds

  • end – end time in seconds

  • binsize – if binned output is desired, size of timebins (default is None for no binning)

  • metric – metric to assess for all non-keypoint1/2 keypoints; options are “angle_m” (for mean angle), “angle_sd” (for std deviation of angle), “distance_m” (for mean distance of keypoints to ego-center), “distance_sd” (for std deviation of distance of keypoints to ego-center), or “travel” (for movement of keypoints relative to ego-center)

  • thresh – threshold for when distance ‘jump’ is too large and should be excluded; default 70

  • selected_subgroups – subgroups to analyze as defined in config; default “all”

  • return_as_df – return as a dataframe or return as KeypointTravel class (for embedding, classification)

Returns:

utils.analyze.get_keypoint_travel(PE_config, keypoint, start, end, binsize=None, thresh=70, selected_subgroups='all', return_as_df=True)[source]

Measure keypoint distance travelled across a group of subjects

Parameters:
  • PE_config – pose estimation config - used for FPS only

  • keypoint – name of keypoint

  • start – start time in seconds

  • end – end time in seconds

  • binsize – if binned output is desired, size of timebins (default is None for no binning)

  • thresh – threshold for when distance ‘jump’ is too large and should be excluded; default 70

  • selected_subgroups – subgroups to analyze as defined in config; default “all”

  • return_as_df – return as a dataframe or return as KeypointTravel class (for embedding, classification)

Returns:

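The thresh parameter can be pictured with the sketch below, which sums frame-to-frame displacement and skips any step larger than the threshold. This assumes that "excluded" means the step is dropped from the total; the helper travel and the sample track are hypothetical and not taken from the library.

```python
import math


def travel(xy, thresh=70):
    """Sum frame-to-frame displacement, skipping steps larger than thresh."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(xy, xy[1:]):
        step = math.hypot(x1 - x0, y1 - y0)
        if step <= thresh:  # a larger step is treated as a tracking jump
            total += step
    return total


# Hypothetical track with one implausible 200-unit jump.
track = [(0, 0), (3, 4), (3, 4), (203, 4), (206, 8)]
distance = travel(track)
print(distance)
```

Without the threshold, a single mis-tracked frame would dominate the total, which is why a jump filter of this kind is useful before comparing groups.
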
utils.analyze.get_module_labels(config, start, stop, subgroups=None)[source]

Generates a Pandas dataframe containing the labels for every frame in the specified time range for the video paths in defined groups.

Parameters:
  • config – the config

  • start – time in seconds to start dataframe from.

  • stop – time in seconds to stop dataframe at.

  • fps – frames per second of recording.

  • subgroups – subgroups to include; by default, None will result in an object without data subgrouped; could alternatively be a list of subgroup names from config or “all” (to include all subgroups present in config)

Returns:

labels dataframe
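
Because start and stop are given in seconds while labels are stored per frame, the time window has to be converted with the frame rate. The sketch below shows one way to do that, assuming a half-open frame range; the helper frame_window is hypothetical, and in the library the frame rate presumably comes from the config.

```python
def frame_window(start, stop, fps):
    """Convert a time window in seconds to a half-open frame index range."""
    first = int(round(start * fps))
    last = int(round(stop * fps))
    return first, last


labels = [0, 0, 1, 1, 2, 2, 2, 1, 0, 0]  # one module label per frame
first, last = frame_window(start=0.2, stop=0.6, fps=10)
window = labels[first:last]
print(first, last, window)
```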

utils.analyze.get_module_transitions(config, labels_df, modules_altered=False)[source]

Reshape labels dataframe from label_counter_subgroups to be an array of features

Parameters:
  • config – config object

  • labels_df – labels dataframe from label_counter_subgroups

  • binsize – width of bins in seconds; if None, no binning is performed

Returns:

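A transition count matrix can be built from a per-frame label sequence as sketched below. This is an illustration only: the helper transition_counts is hypothetical, and whether the library counts self-transitions is not stated here, so the sketch makes that a flag.

```python
def transition_counts(labels, n_modules, count_self=False):
    """Count transitions between consecutive frames; rows are source modules."""
    matrix = [[0] * n_modules for _ in range(n_modules)]
    for src, dst in zip(labels, labels[1:]):
        if src != dst or count_self:
            matrix[src][dst] += 1
    return matrix


labels = [0, 0, 1, 1, 2, 2, 2, 1, 0, 0]
counts = transition_counts(labels, n_modules=3)
for row in counts:
    print(row)
```

Flattening the matrix row by row gives one feature vector per observation, which is the shape a classifier or embedding method expects.
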
utils.analyze.get_module_usage(config, labels_df, binsize=None, modules_altered=False)[source]

Reshape labels dataframe from label_counter_subgroups to be an array of features

Parameters:
  • config – config object

  • labels_df – labels dataframe from label_counter_subgroups

  • binsize – width of bins in seconds; if None, no binning is performed

  • modules_altered – must be true if modules have been remapped

Returns:

object of class ModuleUsage
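
The effect of binsize can be sketched as follows: without binning there is one row of per-module frame counts, and with binning there is one row per time bin. The helper module_usage and the explicit fps argument are hypothetical; the library returns a ModuleUsage object rather than plain lists.

```python
from collections import Counter


def module_usage(labels, n_modules, fps, binsize=None):
    """Count frames per module, optionally within consecutive time bins.

    Returns a list of rows (one per bin); each row holds n_modules counts.
    """
    frames_per_bin = len(labels) if binsize is None else int(round(binsize * fps))
    frames_per_bin = max(1, frames_per_bin)
    rows = []
    for begin in range(0, len(labels), frames_per_bin):
        counts = Counter(labels[begin:begin + frames_per_bin])
        rows.append([counts.get(m, 0) for m in range(n_modules)])
    return rows


labels = [0, 0, 1, 1, 2, 2, 2, 1, 0, 0]
whole = module_usage(labels, n_modules=3, fps=5)
binned = module_usage(labels, n_modules=3, fps=5, binsize=1)
print(whole)
print(binned)
```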

utils.analyze.is_nonnum(value)[source]
utils.analyze.load_module_feature_object(module_feature_object_path)[source]
utils.analyze.loocv(module_feature_object, method='lda')[source]

Perform leave-one-out cross-validation for a method of classifying pose segmentation data using either module usage or transitions

Parameters:
  • module_feature_object – ModuleUsage or ModuleTransitions object

  • method – classification method to use; options include “lda”, “logisticregression”, “mlp”, “naivebayes”, “knn”, or “randomforest”

Returns:

accuracy, conf_mat
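
Leave-one-out cross-validation holds out each observation in turn, trains on the rest, and scores the held-out prediction. The sketch below shows the loop with a simple nearest-centroid classifier standing in for the methods listed above; the helper names, group labels, and feature values are hypothetical.

```python
def nearest_centroid(train_x, train_y, sample):
    """Predict the label whose class mean is closest to sample."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(sample, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loocv(features, labels):
    """Return (accuracy, confusion matrix) with rows = true class, cols = predicted."""
    classes = sorted(set(labels))
    conf_mat = [[0] * len(classes) for _ in classes]
    for i, (sample, truth) in enumerate(zip(features, labels)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        pred = nearest_centroid(train_x, train_y, sample)
        conf_mat[classes.index(truth)][classes.index(pred)] += 1
    correct = sum(conf_mat[k][k] for k in range(len(classes)))
    return correct / len(labels), conf_mat


# Hypothetical two-module usage fractions for two groups.
features = [[0.8, 0.2], [0.7, 0.3], [0.9, 0.1], [0.2, 0.8], [0.3, 0.7], [0.1, 0.9]]
labels = ["saline", "saline", "saline", "drug", "drug", "drug"]
accuracy, conf_mat = loocv(features, labels)
print(accuracy, conf_mat)
```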

utils.analyze.loocv_regression(module_feature_object, dose_dict, method='LinearRegression', constrain_pos=True, degree=1, alpha=1)[source]

Perform LOOCV for linear regression

Parameters:
  • module_feature_object – ModuleUsage, ModuleTransitions, or KeypointFeature object

  • dose_dict – dictionary with keys corresponding to subgroups and values giving the variable (e.g., dose) for each subgroup

  • method – method of regression; either “LinearRegression” (default) or “Ridge” or “Lasso”

  • constrain_pos – constrain to only positive values; true by default

  • degree – degree of polynomial (default 1)

  • alpha – alpha (default 1; only applies for regularized regression, i.e. Ridge and Lasso)

Returns:

held-out predictions, squared error
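
The role of dose_dict and the held-out predictions can be sketched with a one-feature least-squares fit. This is a simplified illustration, not the library's code: the helpers fit_line and loocv_line, the subgroup names, and the values are hypothetical, and the constrain_pos, degree, and alpha options are left out.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loocv_line(xs, ys):
    """Hold out each sample in turn; return predictions and squared errors."""
    preds, sq_errs = [], []
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        pred = slope * xs[i] + intercept
        preds.append(pred)
        sq_errs.append((pred - ys[i]) ** 2)
    return preds, sq_errs


# Hypothetical mapping from subgroup name to the continuous variable.
dose_dict = {"vehicle": 0.0, "low": 1.0, "high": 3.0}
groups = ["vehicle", "vehicle", "low", "low", "high", "high"]
usage = [0.10, 0.12, 0.30, 0.32, 0.70, 0.72]  # one feature per subject

doses = [dose_dict[g] for g in groups]  # regression targets
preds, sq_errs = loocv_line(usage, doses)
print(preds)
print(sq_errs)
```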

utils.analyze.make_remappings_from_BORIS(config, labels_df=None, BORIS_to_pose_mat=None, force_no_numeric=False)[source]

Make remappings based on BORIS output and apply to labels_df

Parameters:
  • config – config

  • labels_df

  • BORIS_to_pose_mat – the non-normalized result matrix (first result option) from BORIS_to_pose

Returns:

utils.analyze.packed_moving_average(x, w)[source]
utils.analyze.pickle_dump(object, save_path)[source]
utils.analyze.pickle_load(object_path)[source]
utils.analyze.read_openface_csv(of_config, filepath)[source]
utils.analyze.read_sleap_csv(filepath, track_name)[source]
utils.analyze.regress(module_feature_object, dose_dict, method='LinearRegression', degree=1, alpha=1)[source]

Regress a continuous variable (e.g., dose, stimulus value) from a module or keypoint feature object

Parameters:
  • module_feature_object – ModuleUsage, ModuleTransitions, or KeypointFeature object

  • dose_dict – dictionary with keys corresponding to subgroups and values giving the variable (e.g., dose) for each subgroup

  • method – method of regression; either “LinearRegression” (default) or “Ridge” or “Lasso”

  • degree – polynomial degree; default 1

  • alpha – alpha (default 1; only applies for regularized regression, i.e. Ridge and Lasso)

Returns:

regression model, dose_labels

utils.metadata module

utils.metadata.create_PE_project(project_name, data_directory, data_source, output_directory, fps)[source]

Make project directory and write config_PE.yaml file.

Parameters:
  • project_name – what to call your creation

  • data_directory – path to source data

  • data_source – DeepLabCut or SLEAP

  • output_directory – path where output directory and config file should be created

  • fps – frames per second

Returns:

path to saved config (in specified output directory)

utils.metadata.create_PS_project(project_name, data_directory, data_source, output_directory, fps, n_modules)[source]

Make project directory and write config_PS.yaml file.

Parameters:
  • project_name – what to call your creation

  • data_directory – path to source data

  • data_source – B-SOiD, VAME, or Keypoint-MoSeq

  • output_directory – path where output directory and config file should be created

  • fps – frames per second

  • n_modules – number of pose states / modules

Returns:

path to saved config (in specified output directory)

utils.metadata.edit_config(config_path)[source]
utils.metadata.load_project(config_path)[source]

Loads project from config_PS.yaml or config_PE.yaml file

Parameters:

config_path – path to config file

utils.metadata.save_edited_project(config, config_path)[source]

Saves an edited project to a config_PS.yaml or config_PE.yaml file

Parameters:

config_path – path to config file

utils.plot module

utils.plot.BORIS_to_pose_matrix_plot(config, boris_to_pose_output, figW=4, figH=2.5, cmap='Greens', outline_top_match=True)[source]
utils.plot.fig_to_array(fig)[source]
utils.plot.is_nonnum(value)[source]
utils.plot.make_and_plot_ellipse(mean, cov, color, label=None)[source]
utils.plot.module_usage_sandplot(config, module_usage, remap=False, title=None, legend=True, long_legend=True, figW=7, figH=3, convolve=False, window=5)[source]

Plot module usage as a sandplot

Parameters:
  • config – the config object

  • module_usage

  • BORIS_to_pose_mat – optional BORIS_to_pose_mat from analyze.BORIS_to_pose to re-align modules by their most overlapping manually scored behavior class

  • title

  • legend – plot legend or not

  • long_legend

  • convolve

  • window

Returns:

utils.plot.moving_average(x, w)[source]
utils.plot.network_plot(config, labels_df=None, module_usage=None, module_transitions=None, cmap='bwr', include_labels=True, scaling=1, tscale=6, figW=2.8, figH=2.5, alt_labels=None)[source]

Plot network comparison. You must provide either labels_df or both module_usage and module_transitions

Parameters:
  • config – project config object

  • labels_df – labels_df for two groups to be compared; if provided, ModuleUsage and ModuleTransitions will be computed

  • module_usage – ModuleUsage object for comparison between two groups (not needed if labels_df is provided)

  • module_transitions – ModuleTransitions object for comparison between two groups (not needed if labels_df is provided)

  • cmap – color map

  • include_labels – True or False; default True

  • scaling – controls size of nodes in network plot; larger values give bigger nodes; default 1

  • tscale – vmax and vmin for tscore; default 6

  • figW – figure width

  • figH – figure height

  • alt_labels – possible alt labels

Returns:

utils.plot.plot_action_units(config, action_units, figW=4, figH=2, style='bar_scatter', cmap='jet', legend_pos='outside_right', legend=True, alt_labels=None, alt_xticks=None, title=None, plot_stats=False)[source]

Plot mean action units

Parameters:
  • config – config

  • action_units – output of analyze.get_action_units

  • figW – figure width (default: 4)

  • figH – figure height (default: 2)

  • style – plot style; “bar_scatter”, “bar_error”, “points”

  • cmap – colormap (default: jet)

  • legend_pos – legend position (default: outside_right)

  • legend – boolean to include or not include legend (default: True)

  • alt_labels – alternative group label dictionary (default: None)

  • alt_xticks – alternative xticklabels (list), for stacked plot style only (default: None)

  • title – plot title (default: None)

  • plot_stats – include stats on plot (default: False)

Returns:

fig

utils.plot.plot_distance_box(module_feature_object, dist_mat, cmap='Blues', figW=3, figH=3, alt_labels=None, title=None)[source]

Plot distance boxplot

Parameters:
  • module_feature_object

  • dist_mat

  • cmap

  • figW

  • figH

  • alt_labels

  • title

Returns:

utils.plot.plot_distance_matrix(module_feature_object, dist_mat, cmap='Greens', figW=3, figH=3, alt_labels=None, title=None)[source]

Plot distance matrix

Parameters:
  • module_feature_object

  • dist_mat

  • cmap

  • figW

  • figH

  • alt_labels

  • title

Returns:

utils.plot.plot_distance_results(module_feature_object, distance_results, figW=3, figH=3, cmap='Blues', title=None)[source]

Plot results from distance computation

Parameters:
  • module_feature_object

  • distance_results

  • figW

  • figH

  • cmap

  • title

Returns:

utils.plot.plot_embeddings(module_feature_object, embeddings_object, figW=3, figH=3, cmap='viridis', title=None, legend=False, draw_ellipse=True, alt_legend=None)[source]

Plot embeddings

Parameters:
  • module_feature_object – module feature object (ModuleUsage or ModuleTransitions) from analyze.get_module_{xx}

  • embeddings_object – embeddings object (LDA or PCA) from analyze.embed

  • figW – figure width

  • figH – figure height

  • cmap – matplotlib colormap

  • title – title string, or None

  • legend – True or False

Returns:

fig

utils.plot.plot_keypoint_kinematics(config, kinematics, figW=4, figH=2, style='bar_scatter', cmap='jet', legend_pos='outside_right', legend=True, alt_labels=None, alt_xticks=None, title=None, plot_stats=False)[source]

Plot keypoint kinematics

Parameters:
  • config – config

  • kinematics – output of analyze.get_keypoint_kinematics

  • figW – figure width (default: 4)

  • figH – figure height (default: 2)

  • style – plot style; “bar_scatter”, “bar_error”, “points”

  • cmap – colormap (default: jet)

  • legend_pos – legend position (default: outside_right)

  • legend – boolean to include or not include legend (default: True)

  • alt_labels – alternative group label dictionary (default: None)

  • alt_xticks – alternative xticklabels (list), for stacked plot style only (default: None)

  • title – plot title (default: None)

  • plot_stats – include stats on plot (default: False)

Returns:

fig

utils.plot.plot_keypoint_travel(keypoint_feature, cmap='viridis', plottype='band', figW=6, figH=3)[source]

Plots displacement of a keypoint either over time or in bins from dist_df (output of analyze.dist_df_subgroups)

Parameters:
  • dist_df – dist_df output from analyze.dist_df_subgroups

  • cmap – matplotlib colormap

  • plottype – type of plot (“band”, “errorbar”, or “bar” if no timebins)

  • figW – figure width

  • figH – figure height

Returns:

utils.plot.plot_module_usage(config, usage_feats, figW=4, figH=2, style='bar_scatter', cmap='jet', legend_pos='outside_right', remap=False, legend=True, long_legend=False, alt_labels=None, alt_xticks=None, title=None, plot_stats=False)[source]

Plot module usage

Parameters:
  • config – config

  • usage_feats – output of analyze.get_module_usage

  • figW – figure width (default: 4)

  • figH – figure height (default: 2)

  • style – plot style; “bar_scatter”, “bar_error”, “points”, or “stacked”

  • cmap – colormap (default: jet)

  • legend_pos – legend position (default: outside_right)

  • remap – whether to remap modules according to config[“remappings”] (default: False)

  • legend – boolean to include or not include legend (default: True)

  • long_legend – long legend (default: False)

  • alt_labels – alternative group label dictionary (default: None)

  • alt_xticks – alternative xticklabels (list), for stacked plot style only (default: None)

  • title – plot title (default: None)

  • plot_stats – include stats on plot (default: False)

Returns:

fig

utils.simulate module

utils.simulate.generate_sequence(config, labels_df, T, n_subjs=1, random_state=42, verbose=False)[source]

Generate individual simulated pose module label sequence of length T (noting that T = number of observations, not time)

Parameters:
  • config – the config object

  • labels_df – labels_df from analyze.get_module_labels

  • T – length of sequence to be generated

  • n_subjs – number of subjects to generate (if labels_df has subgroups, will be number of subjects PER GROUP)

  • random_state – random state seed (default: 42)

  • verbose – print progress (default: False)

Returns:

sequence
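
One common way to simulate a label sequence is a first-order Markov chain whose transition probabilities are estimated from observed labels. Whether generate_sequence works this way is not stated here, so the sketch below is an assumption; the helper simulate_sequence and the sample labels are hypothetical.

```python
import random


def simulate_sequence(labels, T, random_state=42):
    """Sample T labels from a first-order Markov chain fitted to labels."""
    rng = random.Random(random_state)
    states = sorted(set(labels))
    # Count observed transitions between consecutive frames.
    counts = {s: {t: 0 for t in states} for s in states}
    for src, dst in zip(labels, labels[1:]):
        counts[src][dst] += 1
    current = labels[0]
    sequence = [current]
    while len(sequence) < T:
        weights = [counts[current][t] for t in states]
        if sum(weights) == 0:
            # No outgoing transition was observed: restart from any state.
            current = rng.choice(states)
        else:
            current = rng.choices(states, weights=weights, k=1)[0]
        sequence.append(current)
    return sequence[:max(T, 0)]


labels = [0, 0, 1, 1, 2, 2, 2, 1, 0, 0]
seq = simulate_sequence(labels, T=50)
print(seq)
```

As in the description above, T counts observations (frames), not seconds, and fixing random_state makes the generated sequence reproducible.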

utils.simulate.generate_usage(module_feature_object, n_samples, random_state=42, mode='log-normal', scale=10)[source]

Generate simulated pose module usage object

Parameters:
  • module_feature_object – module feature object of class ModuleUsage (from analyze.get_module_usage) or ModuleTransitions (from analyze.get_module_transitions)

  • n_samples – number of samples to generate

  • random_state – random state seed (default: 42)

  • mode – ‘log-normal’ or ‘multivariate_gaussian’; default log-normal

  • scale – scale factor for variance in log-normal mode (not needed for multivariate_gaussian mode)

Returns:

module_feature_object of the same style as the one input
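
The log-normal mode can be pictured as fitting a mean and standard deviation to the log of each module's usage and sampling from that distribution. The sketch below is an assumption about the general approach, not the library's code: the helper simulate_usage is hypothetical, modules are sampled independently, the scale parameter is omitted, and samples are renormalized to sum to 1.

```python
import math
import random


def simulate_usage(usage_rows, n_samples, random_state=42):
    """Draw usage vectors from per-module log-normal fits, renormalized to 1."""
    rng = random.Random(random_state)
    logs = [[math.log(v) for v in row] for row in usage_rows]
    n = len(logs)
    columns = list(zip(*logs))
    mus = [sum(col) / n for col in columns]
    sigmas = [
        math.sqrt(sum((v - mu) ** 2 for v in col) / (n - 1))
        for col, mu in zip(columns, mus)
    ]
    samples = []
    for _ in range(n_samples):
        raw = [rng.lognormvariate(mu, sigma) for mu, sigma in zip(mus, sigmas)]
        total = sum(raw)
        samples.append([v / total for v in raw])
    return samples


# Hypothetical usage fractions: one row per subject, one column per module.
usage_rows = [
    [0.50, 0.30, 0.20],
    [0.60, 0.25, 0.15],
    [0.45, 0.35, 0.20],
    [0.55, 0.30, 0.15],
]
samples = simulate_usage(usage_rows, n_samples=5)
for row in samples:
    print(row)
```

Sampling in log space keeps every simulated usage value strictly positive, which a plain Gaussian fit would not guarantee.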

utils.simulate.generate_usage_labeled(module_feature_object, n_samples_per_bin, bins, regression, max_iters='default', random_state=42, mode='log-normal', scale=10, verbosity='medium')[source]

Generate simulated pose module usage object

Parameters:
  • module_feature_object – module feature object of class ModuleUsage (from analyze.get_module_usage) or ModuleTransitions (from analyze.get_module_transitions)

  • n_samples_per_bin – number of samples to generate per bin in timebin

  • bins – 2D array of bins containing upper and lower limit for each bin

  • regression – regression model to label samples

  • max_iters – number for how many iterations to attempt to generate samples; will raise an error if exceeded without generating enough samples; number or ‘default’ for n_samples_total*10

  • random_state – random state seed (default: 42)

  • mode – ‘log-normal’ or ‘multivariate_gaussian’; default log-normal

  • scale – scaling factor for covariance in log-transformed approach

  • verbosity – ‘low’, ‘medium’, or ‘high’

Returns:

module_feature_object of the same style as the one input

Module contents