utils package
Submodules
utils.analyze module
- class utils.analyze.ActionUnits(action_units, group_labels, observation_labels, feat_names, group_dict, scaler)[source]
Bases:
object
- utils.analyze.BORIS_to_pose(config, verbose=True)[source]
Take paired BORIS one-hot-encoded observation files and pose segmentation files and align them to see which behaviors line up with which pose modules
- Parameters:
config – the config
- Returns:
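
The alignment described for BORIS_to_pose can be pictured as a co-occurrence count between scored behaviors and pose modules. The following is only a minimal sketch of that idea, not the library's implementation: the helper name `overlap_matrix`, the input layout (a dict of 0/1 flags per behavior plus one module label per frame), and the toy data are all assumptions made for illustration.

```python
from collections import Counter

def overlap_matrix(behaviors, modules):
    """Count frames on which each scored behavior co-occurs with each pose module.

    behaviors: dict mapping behavior name -> list of 0/1 flags, one per frame
    modules:   list of integer pose-module labels, one per frame
    (Both layouts are hypothetical stand-ins for the real input files.)
    """
    module_ids = sorted(set(modules))
    counts = {name: Counter() for name in behaviors}
    for name, flags in behaviors.items():
        for flag, module in zip(flags, modules):
            if flag:
                counts[name][module] += 1
    # Rows are behaviors, columns are modules in ascending label order.
    matrix = {name: [counts[name][m] for m in module_ids] for name in behaviors}
    return module_ids, matrix

behaviors = {"groom": [1, 1, 0, 0, 0, 0], "rear": [0, 0, 0, 1, 1, 1]}
modules = [2, 2, 0, 1, 1, 0]
module_ids, matrix = overlap_matrix(behaviors, modules)
```

In this toy case "groom" overlaps only with module 2, while "rear" overlaps mostly with module 1.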
- class utils.analyze.KeypointFeature(keypoint_feature, group_labels, observation_labels, feat_names, group_dict, scaler, feature_type)[source]
Bases:
object
- class utils.analyze.ModuleTransitions(transition_counts, transition_count_matrices, group_labels, observation_labels, feat_names, group_dict)[source]
Bases:
object
- class utils.analyze.ModuleUsage(label_counts, group_labels, observation_labels, feat_names, group_dict, scaler)[source]
Bases:
object
- utils.analyze.classify(module_feature_object, method='lda')[source]
Classify pose segmentation data using either module usage or transitions
- Parameters:
module_feature_object – ModuleUsage or ModuleTransitions object
method – classification method to use; options include “lda”, “logisticregression”, “mlp”, “naivebayes”, “knn”, or “randomforest”
- Returns:
classifier
- utils.analyze.combine_pose_modules(config, labels_df, force_no_numeric=False)[source]
Combine pose modules based on remappings key in config
- Parameters:
config – config
labels_df – from label_counter (subgroups or no_subgroups)
- Returns:
labels_df_remapped
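
Combining modules amounts to replacing each label with the label of its merged group. The sketch below assumes the remappings can be expressed as a plain dictionary from original label to combined label; the actual structure of the remappings key in the config is not documented here, and `remap_labels` is a hypothetical helper.

```python
def remap_labels(labels, remappings):
    """Replace each module label with its combined label.

    Labels without an entry in `remappings` pass through unchanged.
    The dict layout is an assumption, not the library's config format.
    """
    return [remappings.get(label, label) for label in labels]

remappings = {0: "locomotion", 3: "locomotion", 1: "grooming"}
labels = [0, 3, 1, 2, 0]
remapped = remap_labels(labels, remappings)
```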
- utils.analyze.ego_center(config, data, keypoint_ego1, keypoint_ego2)[source]
Ego-center a single pose estimation dataframe
- Parameters:
config – config
data – pose estimation pandas dataframe in DLC format
keypoint_ego1 – first keypoint to use for establishing egocentric alignment axis
keypoint_ego2 – second keypoint to use for establishing egocentric alignment axis
- Returns:
ego_centered pandas dataframe
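
One common way to ego-center a frame is to translate it so the first keypoint sits at the origin and then rotate it so the second keypoint lies on the positive x axis. The sketch below illustrates that convention for a single frame; whether ego_center uses the same axis convention is an assumption, and `ego_center_frame` and the keypoint names are hypothetical.

```python
import math

def ego_center_frame(points, ego1, ego2):
    """Translate so `ego1` is the origin, then rotate so `ego2` lies on the +x axis.

    points: dict mapping keypoint name -> (x, y) for a single frame
    (an illustrative layout, not the DLC dataframe format).
    """
    ox, oy = points[ego1]
    ax, ay = points[ego2][0] - ox, points[ego2][1] - oy
    theta = math.atan2(ay, ax)  # current angle of the alignment axis
    cos_t, sin_t = math.cos(-theta), math.sin(-theta)
    centered = {}
    for name, (x, y) in points.items():
        dx, dy = x - ox, y - oy
        # Standard 2D rotation by -theta.
        centered[name] = (dx * cos_t - dy * sin_t, dx * sin_t + dy * cos_t)
    return centered

frame = {"tailbase": (10.0, 10.0), "nose": (10.0, 14.0), "left_ear": (9.0, 13.0)}
aligned = ego_center_frame(frame, "tailbase", "nose")
```

After alignment the second keypoint keeps its distance from the first but has a y coordinate of zero, so angles and distances of the remaining keypoints become comparable across frames.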
- utils.analyze.embed(module_feature_object, method='lda', n_components=2)[source]
Get dimensionally reduced space embeddings of the data with LDA or PCA
- Parameters:
module_feature_object – module feature object
method – LDA or PCA; default LDA
n_components – number of components
- Returns:
- utils.analyze.get_action_units(config, start, end, binsize=None, selected_subgroups=None, aus_to_include='all')[source]
Get mean action units
- Parameters:
config – project config
start – start time (in s)
end – end time (in s)
binsize – optional; temporal bin size (default None for no binning)
selected_subgroups – optional; selected subgroups
aus_to_include – list of integers corresponding to action units to include, or “all” for all; default “all”
- Returns:
ActionUnits
- utils.analyze.get_distance(module_feature_object, method='euclidean')[source]
Get distance
- Parameters:
module_feature_object –
method –
- Returns:
- utils.analyze.get_keypoint_kinematics(PE_config, keypoint_ego1, keypoint_ego2, start, end, binsize=None, metric='angle', thresh=70, selected_subgroups='all', return_as_df=True, verbose=False)[source]
Measure keypoint kinematics (egocentrically aligned angle, distance, and travel of keypoints) across group of subjects
- Parameters:
PE_config – pose estimation config - used for FPS only
keypoint_ego1 – first keypoint to use for establishing egocentric alignment axis
keypoint_ego2 – second keypoint to use for establishing egocentric alignment axis
start – start time in seconds
end – end time in seconds
binsize – if binned output is desired, size of timebins (default is None for no binning)
metric – metric to assess for all non-keypoint1/2 keypoints; options are “angle_m” (for mean angle), “angle_sd” (for std deviation of angle), “distance_m” (for mean distance of keypoints to ego-center), “distance_sd” (for std deviation of distance of keypoints to ego-center), or “travel” (for movement of keypoints relative to ego-center)
thresh – threshold for when distance ‘jump’ is too large and should be excluded; default 70
selected_subgroups – subgroups to analyze as defined in config; default “all”
return_as_df – return as a dataframe or return as KeypointTravel class (for embedding, classification)
- Returns:
- utils.analyze.get_keypoint_travel(PE_config, keypoint, start, end, binsize=None, thresh=70, selected_subgroups='all', return_as_df=True)[source]
Measure keypoint distance travelled across group of subjects
- Parameters:
PE_config – pose estimation config - used for FPS only
keypoint – name of keypoint
start – start time in seconds
end – end time in seconds
binsize – if binned output is desired, size of timebins (default is None for no binning)
thresh – threshold for when distance ‘jump’ is too large and should be excluded; default 70
selected_subgroups – subgroups to analyze as defined in config; default “all”
return_as_df – return as a dataframe or return as KeypointTravel class (for embedding, classification)
- Returns:
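
The role of thresh can be illustrated with a small stand-alone computation: sum the frame-to-frame displacement of one keypoint and skip any step larger than the threshold. This is a sketch under assumptions, not the library's code; in particular, whether a step exactly equal to the threshold is kept, and the units of the threshold, are guesses. `keypoint_travel` is a hypothetical helper.

```python
import math

def keypoint_travel(xy, thresh=70):
    """Sum frame-to-frame displacement, skipping steps larger than `thresh`.

    xy: list of (x, y) positions, one per frame.
    """
    total = 0.0
    for (x0, y0), (x1, y1) in zip(xy, xy[1:]):
        step = math.hypot(x1 - x0, y1 - y0)
        if step <= thresh:  # larger steps are treated as tracking jumps (assumption)
            total += step
    return total

track = [(0, 0), (3, 4), (3, 4), (203, 4), (206, 8)]
travel = keypoint_travel(track)
```

Here the 200-unit step is discarded as a jump, leaving the two 5-unit steps.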
- utils.analyze.get_module_labels(config, start, stop, subgroups=None)[source]
Generates a Pandas dataframe containing the labels for every frame in the specified time range for the video paths in defined groups.
- Parameters:
config – the config
start – time in seconds to start dataframe from.
stop – time in seconds to stop dataframe at.
fps – frames per second of recording.
subgroups – subgroups to include; by default, None will result in an object without data subgrouped; could alternatively be a list of subgroup names from config or “all” (to include all subgroups present in config)
- Returns:
labels dataframe
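
Because start and stop are given in seconds while labels exist per frame, selecting the time range comes down to a seconds-to-frames conversion. A minimal sketch of that conversion follows; `frame_range` is a hypothetical helper, and the rounding and half-open interval are assumptions about how the boundaries are treated.

```python
def frame_range(start, stop, fps):
    """Convert a [start, stop) window in seconds to frame indices."""
    return int(round(start * fps)), int(round(stop * fps))

labels = list(range(300))  # stand-in for one module label per frame
first, last = frame_range(start=2, stop=5, fps=30)
window = labels[first:last]
```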
- utils.analyze.get_module_transitions(config, labels_df, modules_altered=False)[source]
Reshape labels dataframe from label_counter_subgroups to be an array of features
- Parameters:
config – config object
labels_df – labels dataframe from label_counter_subgroups
binsize – width of bins in seconds; if None, no binning is performed
- Returns:
- utils.analyze.get_module_usage(config, labels_df, binsize=None, modules_altered=False)[source]
Reshape labels dataframe from label_counter_subgroups to be an array of features
- Parameters:
config – config object
labels_df – labels dataframe from label_counter_subgroups
binsize – width of bins in seconds; if None, no binning is performed
modules_altered – must be true if modules have been remapped
- Returns:
object of class ModuleUsage
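
Conceptually, module usage is a count of how many frames fall into each module, computed either over the whole recording or per time bin. The sketch below shows that reshaping on a plain list of labels; it is not the ModuleUsage class, and `module_usage`, its arguments, and the choice to return raw counts rather than scaled values are assumptions.

```python
from collections import Counter

def module_usage(labels, n_modules, fps, binsize=None):
    """Return per-bin usage counts; a single bin if `binsize` is None.

    Assumes `labels` is non-empty and that binsize * fps is at least 1.
    """
    frames_per_bin = len(labels) if binsize is None else int(binsize * fps)
    usage = []
    for start in range(0, len(labels), frames_per_bin):
        counts = Counter(labels[start:start + frames_per_bin])
        usage.append([counts[m] for m in range(n_modules)])
    return usage

labels = [0, 0, 1, 2, 2, 2, 1, 1]
whole = module_usage(labels, n_modules=3, fps=2)
binned = module_usage(labels, n_modules=3, fps=2, binsize=2)
```

Each inner list is one feature vector (one value per module), which is the shape a classifier or embedding would consume.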
- utils.analyze.loocv(module_feature_object, method='lda')[source]
Perform leave-one-out cross-validation for a method of classifying pose segmentation data using either module usage or transitions
- Parameters:
module_feature_object – ModuleUsage or ModuleTransitions object
method – classification method to use; options include “lda”, “logisticregression”, “mlp”, “naivebayes”, “knn”, or “randomforest”
- Returns:
accuracy, conf_mat
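
Leave-one-out cross-validation holds out each sample in turn, trains on the rest, and scores the prediction for the held-out sample. The sketch below shows the procedure with a hand-written 1-nearest-neighbour classifier so that it stays self-contained; the library's own classifiers and its confusion-matrix layout may differ, and `nearest_label`, `loocv_sketch`, and the toy data are hypothetical.

```python
def nearest_label(train_x, train_y, query):
    """1-nearest-neighbour prediction using squared Euclidean distance."""
    dists = [sum((a - b) ** 2 for a, b in zip(x, query)) for x in train_x]
    return train_y[dists.index(min(dists))]

def loocv_sketch(features, labels):
    """Return (accuracy, confusion matrix) with rows = true class, columns = predicted."""
    classes = sorted(set(labels))
    conf_mat = [[0] * len(classes) for _ in classes]
    for i, (x, y) in enumerate(zip(features, labels)):
        # Train on everything except sample i.
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        pred = nearest_label(train_x, train_y, x)
        conf_mat[classes.index(y)][classes.index(pred)] += 1
    correct = sum(conf_mat[k][k] for k in range(len(classes)))
    return correct / len(labels), conf_mat

features = [[0.9, 0.1], [0.8, 0.2], [0.85, 0.15],
            [0.2, 0.8], [0.1, 0.9], [0.15, 0.85]]
labels = ["saline", "saline", "saline", "drug", "drug", "drug"]
accuracy, conf_mat = loocv_sketch(features, labels)
```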
- utils.analyze.loocv_regression(module_feature_object, dose_dict, method='LinearRegression', constrain_pos=True, degree=1, alpha=1)[source]
Perform LOOCV for linear regression
- Parameters:
module_feature_object – ModuleUsage, ModuleTransitions, or KeypointFeature object
dose_dict – dictionary with keys corresponding to subgroups and values corresponding to the variable
method – method of regression; either “LinearRegression” (default) or “Ridge” or “Lasso”
constrain_pos – constrain to only positive values; true by default
degree – degree of polynomial (default 1)
alpha – alpha (default 1; only applies for regularized regression, i.e. Ridge and Lasso)
- Returns:
held-out predictions, squared error
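
For regression, the same leave-one-out idea yields one held-out prediction and one squared error per sample. The sketch below uses ordinary least squares on a single feature to keep the arithmetic visible; it ignores the constrain_pos, degree, and alpha options, and `fit_line` and `loocv_line` are hypothetical helpers rather than part of the package.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx

def loocv_line(xs, ys):
    """Return held-out predictions and their squared errors."""
    preds, sq_errs = [], []
    for i in range(len(xs)):
        # Fit on all samples except i, then predict sample i.
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        pred = slope * xs[i] + intercept
        preds.append(pred)
        sq_errs.append((pred - ys[i]) ** 2)
    return preds, sq_errs

usage = [1.0, 2.0, 3.0, 4.0]  # toy single-feature values
dose = [2.0, 4.0, 6.0, 8.0]   # toy continuous variable
preds, sq_errs = loocv_line(usage, dose)
```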
- utils.analyze.make_remappings_from_BORIS(config, labels_df=None, BORIS_to_pose_mat=None, force_no_numeric=False)[source]
Make remappings based on BORIS output and apply to labels_df
- Parameters:
config – config
labels_df –
BORIS_to_pose_mat – the non-normalized result matrix (first result option) from BORIS_to_pose
- Returns:
- utils.analyze.regress(module_feature_object, dose_dict, method='LinearRegression', degree=1, alpha=1)[source]
Regress a continuous variable (e.g., dose, stimulus value) from a module or keypoint feature object
- Parameters:
module_feature_object – ModuleUsage, ModuleTransitions, or KeypointFeature object
dose_dict – dictionary with keys corresponding to subgroups and values corresponding to the variable
method – method of regression; either “LinearRegression” (default) or “Ridge” or “Lasso”
degree – polynomial degree; default 1
alpha – alpha (default 1; only applies for regularized regression, i.e. Ridge and Lasso)
- Returns:
regression model, dose_labels
utils.metadata module
- utils.metadata.create_PE_project(project_name, data_directory, data_source, output_directory, fps)[source]
Make project directory and write config_PE.yaml file.
- Parameters:
project_name – what to call your creation
data_directory – path to source data
data_source – DeepLabCut or SLEAP
output_directory – path where output directory and config file should be created
fps – frames per second
- Returns:
path to saved config (in specified output directory)
- utils.metadata.create_PS_project(project_name, data_directory, data_source, output_directory, fps, n_modules)[source]
Make project directory and write config_PS.yaml file.
- Parameters:
project_name – what to call your creation
data_directory – path to source data
data_source – B-SOiD, VAME, or Keypoint-MoSeq
output_directory – path where output directory and config file should be created
fps – frames per second
n_modules – number of pose states / modules
- Returns:
path to saved config (in specified output directory)
utils.plot module
- utils.plot.BORIS_to_pose_matrix_plot(config, boris_to_pose_output, figW=4, figH=2.5, cmap='Greens', outline_top_match=True)[source]
- utils.plot.module_usage_sandplot(config, module_usage, remap=False, title=None, legend=True, long_legend=True, figW=7, figH=3, convolve=False, window=5)[source]
Plot a sandplot of module usage
- Parameters:
config – the config object
module_usage –
BORIS_to_pose_mat – optional BORIS_to_pose_mat from analyze.BORIS_to_pose to re-align modules by their most overlapping manually scored behavior class
title –
legend – plot legend or not
long_legend –
convolve –
window –
- Returns:
- utils.plot.network_plot(config, labels_df=None, module_usage=None, module_transitions=None, cmap='bwr', include_labels=True, scaling=1, tscale=6, figW=2.8, figH=2.5, alt_labels=None)[source]
Plot network comparison. You must provide EITHER labels_df OR both module_usage and module_transitions.
- Parameters:
config – project config object
labels_df – labels_df for two groups to be compared; if provided, ModuleUsage and ModuleTransitions will be computed
module_usage – ModuleUsage object for comparison between two groups (not needed if labels_df is provided)
module_transitions – ModuleTransitions object for comparison between two groups (not needed if labels_df is provided)
cmap – color map
include_labels – True or False; default True
scaling – controls size of nodes in network plot; larger values give bigger nodes; default 1
tscale – vmax and vmin for tscore; default 6
figW – figure width
figH – figure height
alt_labels – possible alt labels
- Returns:
- utils.plot.plot_action_units(config, action_units, figW=4, figH=2, style='bar_scatter', cmap='jet', legend_pos='outside_right', legend=True, alt_labels=None, alt_xticks=None, title=None, plot_stats=False)[source]
Plot mean action units
- Parameters:
config – config
action_units – output of analyze.get_action_units
figW – figure width (default: 4)
figH – figure height (default: 2)
style – plot style; “bar_scatter”, “bar_error”, “points”
cmap – colormap (default: jet)
legend_pos – legend position (default: outside_right)
legend – boolean to include or not include legend (default: True)
alt_labels – alternative group label dictionary (default: None)
alt_xticks – alternative xticklabels (list), for stacked plot style only (default: None)
title – plot title (default: None)
plot_stats – include stats on plot (default: False)
- Returns:
fig
- utils.plot.plot_distance_box(module_feature_object, dist_mat, cmap='Blues', figW=3, figH=3, alt_labels=None, title=None)[source]
Plot distance boxplot
- Parameters:
module_feature_object –
dist_mat –
cmap –
figW –
figH –
alt_labels –
title –
- Returns:
- utils.plot.plot_distance_matrix(module_feature_object, dist_mat, cmap='Greens', figW=3, figH=3, alt_labels=None, title=None)[source]
Plot distance matrix
- Parameters:
module_feature_object –
dist_mat –
cmap –
figW –
figH –
alt_labels –
title –
- Returns:
- utils.plot.plot_distance_results(module_feature_object, distance_results, figW=3, figH=3, cmap='Blues', title=None)[source]
Plot results from distance computation
- Parameters:
module_feature_object –
distance_results –
figW –
figH –
cmap –
title –
- Returns:
- utils.plot.plot_embeddings(module_feature_object, embeddings_object, figW=3, figH=3, cmap='viridis', title=None, legend=False, draw_ellipse=True, alt_legend=None)[source]
Plot embeddings
- Parameters:
module_feature_object – module feature object (ModuleUsage or ModuleTransitions) from analyze.get_module_{xx}
embeddings_object – embeddings object (LDA or PCA) from analyze.embed
figW – figure width
figH – figure height
cmap – matplotlib colormap
title – title string, or None
legend – True or False
- Returns:
fig
- utils.plot.plot_keypoint_kinematics(config, kinematics, figW=4, figH=2, style='bar_scatter', cmap='jet', legend_pos='outside_right', legend=True, alt_labels=None, alt_xticks=None, title=None, plot_stats=False)[source]
Plot keypoint kinematics
- Parameters:
config – config
kinematics – output of analyze.get_keypoint_kinematics
figW – figure width (default: 4)
figH – figure height (default: 2)
style – plot style; “bar_scatter”, “bar_error”, “points”
cmap – colormap (default: jet)
legend_pos – legend position (default: outside_right)
legend – boolean to include or not include legend (default: True)
alt_labels – alternative group label dictionary (default: None)
alt_xticks – alternative xticklabels (list), for stacked plot style only (default: None)
title – plot title (default: None)
plot_stats – include stats on plot (default: False)
- Returns:
fig
- utils.plot.plot_keypoint_travel(keypoint_feature, cmap='viridis', plottype='band', figW=6, figH=3)[source]
Plots displacement of a keypoint either over time or in bins from dist_df (output of analyze.dist_df_subgroups)
- Parameters:
dist_df – dist_df output from analyze.dist_df_subgroups
cmap – matplotlib colormap
plottype – type of plot (“band”, “errorbar”, or “bar” if no timebins)
figW – figure width
figH – figure height
- Returns:
- utils.plot.plot_module_usage(config, usage_feats, figW=4, figH=2, style='bar_scatter', cmap='jet', legend_pos='outside_right', remap=False, legend=True, long_legend=False, alt_labels=None, alt_xticks=None, title=None, plot_stats=False)[source]
Plot module usage
- Parameters:
config – config
usage_feats – output of analyze.get_module_usage
figW – figure width (default: 4)
figH – figure height (default: 2)
style – plot style; “bar_scatter”, “bar_error”, “points”, or “stacked”
cmap – colormap (default: jet)
legend_pos – legend position (default: outside_right)
remap – whether to remap modules according to config[“remappings”] (default: False)
legend – boolean to include or not include legend (default: True)
long_legend – long legend (default: False)
alt_labels – alternative group label dictionary (default: None)
alt_xticks – alternative xticklabels (list), for stacked plot style only (default: None)
title – plot title (default: None)
plot_stats – include stats on plot (default: False)
- Returns:
fig
utils.simulate module
- utils.simulate.generate_sequence(config, labels_df, T, n_subjs=1, random_state=42, verbose=False)[source]
Generate an individual simulated pose module label sequence of length T (note that T is the number of observations, not time)
- Parameters:
config – the config object
labels_df – labels_df from analyze.get_module_labels
T – length of sequence to be generated
n_subjs – number of subjects to generate (if labels_df has subgroups, will be number of subjects PER GROUP)
random_state – random state seed (default: 42)
verbose – print progress (default: False)
- Returns:
sequence
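
One simple way to simulate a label sequence from observed labels is a first-order Markov chain: count observed transitions, then sample each next label in proportion to those counts. Whether generate_sequence works this way is not stated in this reference, so the sketch below is purely an assumption-based illustration; `transition_counts` and `simulate_sequence` are hypothetical helpers.

```python
import random

def transition_counts(labels, n_modules):
    """Count transitions: counts[a][b] is how often label a is followed by b."""
    counts = [[0] * n_modules for _ in range(n_modules)]
    for a, b in zip(labels, labels[1:]):
        counts[a][b] += 1
    return counts

def simulate_sequence(labels, n_modules, T, random_state=42):
    """Sample T labels from a first-order Markov chain fitted to `labels`."""
    rng = random.Random(random_state)
    counts = transition_counts(labels, n_modules)
    state = labels[0]
    sequence = [state]
    while len(sequence) < T:
        row = counts[state]
        if sum(row) == 0:
            # No observed exits from this state: fall back to a uniform draw.
            state = rng.randrange(n_modules)
        else:
            state = rng.choices(range(n_modules), weights=row)[0]
        sequence.append(state)
    return sequence

observed = [0, 0, 1, 2, 2, 0, 1, 1, 2, 0]
sequence = simulate_sequence(observed, n_modules=3, T=20)
```

Fixing the seed makes the simulated sequence reproducible, which mirrors the purpose of the random_state parameter.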
- utils.simulate.generate_usage(module_feature_object, n_samples, random_state=42, mode='log-normal', scale=10)[source]
Generate simulated pose module usage object
- Parameters:
module_feature_object – module feature object of class ModuleUsage (from analyze.get_module_usage) or ModuleTransitions (from analyze.get_module_transitions)
n_samples – number of samples to generate
random_state – random state seed (default: 42)
mode – ‘log-normal’ or ‘multivariate_gaussian’; default log-normal
scale – scale factor for variance in log-normal mode (not needed for multivariate_gaussian mode)
- Returns:
module_feature_object of the same style as the one input
- utils.simulate.generate_usage_labeled(module_feature_object, n_samples_per_bin, bins, regression, max_iters='default', random_state=42, mode='log-normal', scale=10, verbosity='medium')[source]
Generate simulated pose module usage object
- Parameters:
module_feature_object – module feature object of class ModuleUsage (from analyze.get_module_usage) or ModuleTransitions (from analyze.get_module_transitions)
n_samples_per_bin – number of samples to generate per bin in timebin
bins – 2D array of bins containing upper and lower limit for each bin
regression – regression model to label samples
max_iters – number for how many iterations to attempt to generate samples; will raise an error if exceeded without generating enough samples; number or ‘default’ for n_samples_total*10
random_state – random state seed (default: 42)
mode – ‘log-normal’ or ‘multivariate_gaussian’; default log-normal
scale – scaling factor for covariance in log-transformed approach
verbosity – ‘low’, ‘medium’, or ‘high’
- Returns:
module_feature_object of the same style as the one input