Preprocessor
Implements modular components for dataset preprocessing: a data-trimmer, a standardizer, a feature selector and a sliding window data generator.
License
The MIT License (MIT)
Copyright (c) 2020 Harvey Bastidas
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Contributors
- Harvey Bastidas <harveybc@ingeni-us.com>
preprocessor
preprocessor package
Subpackages
preprocessor.data_trimmer package
This file contains the DataTrimmer class. To run it as a console script, uncomment or add the following lines in the [options.entry_points] section of setup.cfg:
console_scripts =
    data-trimmer = data_trimmer.__main__:main
Then run python setup.py install, which installs the data-trimmer command in your current environment.
class preprocessor.data_trimmer.data_trimmer.DataTrimmer(conf)
Bases: preprocessor.preprocessor.Preprocessor
The Data Trimmer preprocessor class.
core()
Core preprocessor task, run after the instance is started with the main method. Decides from the arguments which trimming method to call.
Args: args (obj): command line parameters as objects
parse_args(args)
Parse command line parameters.
Parameters: args ([str]) – command line parameters as list of strings
Returns: command line parameters namespace
Return type: argparse.Namespace
trim_auto()
Trims all the constant columns and all rows with consecutive zeroes from the start and end of the input dataset.
Returns: rows_t, cols_t (int, int): number of rows and columns trimmed
trim_columns()
Trims all the constant columns from the input dataset.
Returns: number of rows and columns trimmed
Return type: rows_t, cols_t (int, int)
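The two trimming operations described above can be illustrated with a small standalone sketch. It is not the DataTrimmer implementation: the function names are hypothetical, the data is a plain list of rows, "rows with consecutive zeroes" is assumed to mean rows that are entirely zero, and the order (rows first, then columns) is also an assumption.

```python
from typing import List, Tuple

Row = List[float]


def trim_constant_columns(rows: List[Row]) -> Tuple[List[Row], int]:
    """Drop every column whose values are all identical.

    Returns the trimmed rows and the number of columns removed.
    """
    if not rows:
        return rows, 0
    n_cols = len(rows[0])
    # A column is kept only if it holds more than one distinct value.
    keep = [c for c in range(n_cols) if len({row[c] for row in rows}) > 1]
    trimmed = [[row[c] for c in keep] for row in rows]
    return trimmed, n_cols - len(keep)


def trim_zero_rows(rows: List[Row]) -> Tuple[List[Row], int]:
    """Drop all-zero rows from the start and the end only (assumed meaning).

    Returns the trimmed rows and the number of rows removed.
    """
    def is_zero(row: Row) -> bool:
        return all(value == 0 for value in row)

    start, end = 0, len(rows)
    while start < end and is_zero(rows[start]):
        start += 1
    while end > start and is_zero(rows[end - 1]):
        end -= 1
    return rows[start:end], len(rows) - (end - start)


def trim_auto(rows: List[Row]) -> Tuple[List[Row], int, int]:
    """Trim zero rows, then constant columns; returns (data, rows_t, cols_t)."""
    kept_rows, rows_t = trim_zero_rows(rows)
    data, cols_t = trim_constant_columns(kept_rows)
    return data, rows_t, cols_t
```

Calling trim_auto on a dataset with one leading zero row, two trailing zero rows and one constant column reports rows_t = 3 and cols_t = 1, mirroring the (rows_t, cols_t) pair documented above.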
preprocessor.feature_selector package
This file contains the FeatureSelector class. To run it as a console script, uncomment or add the following lines in the [options.entry_points] section of setup.cfg:
console_scripts =
    feature_selector = feature_selector.__main__:main
Then run python setup.py install, which installs the feature_selector command in your current environment.
class preprocessor.feature_selector.feature_selector.FeatureSelector(conf)
Bases: preprocessor.preprocessor.Preprocessor
The FeatureSelector preprocessor class.
core()
Core preprocessor task, run after the instance is started with the main method. Decides from the arguments which method to call.
Args: args (obj): command line parameters as objects
parse_args(args)
Parse command line parameters.
Parameters: args ([str]) – command line parameters as list of strings
Returns: command line parameters namespace
Return type: argparse.Namespace
preprocessor.sliding_window package
This file contains the SlidingWindow class. To run it as a console script, uncomment or add the following lines in the [options.entry_points] section of setup.cfg:
console_scripts =
    sliding_window = sliding_window.__main__:main
Then run python setup.py install, which installs the sliding_window command in your current environment.
class preprocessor.sliding_window.sliding_window.SlidingWindow(conf)
Bases: preprocessor.preprocessor.Preprocessor
The SlidingWindow preprocessor class.
core()
Core preprocessor task, run after the instance is started with the main method. Decides from the arguments which method to call.
Args: args (obj): command line parameters as objects
parse_args(args)
Parse the command line parameters that are additional to those of the Preprocessor class.
Parameters: args ([str]) – command line parameters as list of strings
Returns: command line parameters namespace
Return type: argparse.Namespace
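The reference above does not describe how the windows are built, so the following is only a generic sketch of sliding-window generation over a sequence of rows. The function name and the window_size and step parameters are hypothetical and are not taken from the SlidingWindow class.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def sliding_windows(rows: Sequence[T], window_size: int, step: int = 1) -> List[List[T]]:
    """Return every full window of `window_size` consecutive rows.

    Windows start every `step` rows; a trailing partial window is dropped.
    """
    if window_size < 1 or step < 1:
        raise ValueError("window_size and step must be positive integers")
    last_start = len(rows) - window_size
    # range() is empty when the input is shorter than one window.
    return [list(rows[i:i + window_size]) for i in range(0, last_start + 1, step)]
```

With step=1, consecutive windows overlap by window_size - 1 rows, which is the usual setup for turning a time series into fixed-length training samples.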
preprocessor.standardizer package
This file contains the Standardizer class. To run it as a console script, uncomment or add the following lines in the [options.entry_points] section of setup.cfg:
console_scripts =
    standardizer = standardizer.__main__:main
Then run python setup.py install, which installs the standardizer command in your current environment.
class preprocessor.standardizer.standardizer.Standardizer(conf)
Bases: preprocessor.preprocessor.Preprocessor
The Standardizer preprocessor class.
core()
Core preprocessor task, run after the instance is started with the main method. Decides from the arguments which method to call.
Args: args (obj): command line parameters as objects
parse_args(args)
Parse command line parameters.
Parameters: args ([str]) – command line parameters as list of strings
Returns: command line parameters namespace
Return type: argparse.Namespace
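This reference does not say which scaling the Standardizer applies. As an illustration only, the sketch below assumes column-wise z-score standardization (subtract the mean, divide by the population standard deviation). The function name and return shape are hypothetical.

```python
from statistics import fmean, pstdev
from typing import List, Tuple

Row = List[float]


def standardize(rows: List[Row]) -> Tuple[List[Row], List[float], List[float]]:
    """Scale each column to zero mean and unit variance (assumed method).

    Returns the scaled rows plus the per-column means and standard deviations,
    so the same transform can be reapplied to new data.
    """
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    stds = [pstdev(col) for col in columns]
    scaled = [
        # A constant column has zero deviation; map it to 0.0 instead of dividing by zero.
        [(value - m) / s if s > 0 else 0.0 for value, m, s in zip(row, means, stds)]
        for row in rows
    ]
    return scaled, means, stds
```

Returning the means and deviations alongside the data is a design choice of this sketch: it lets statistics computed on a training set be reused on validation or test data.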
Submodules
preprocessor.conftest module
preprocessor.preprocessor module
This file contains the Preprocessor class, the base class for the DataTrimmer, FeatureSelector, Standardizer and SlidingWindow classes.
class preprocessor.preprocessor.Preprocessor(conf)
Bases: preprocessor.preprocessor_base.PreprocessorBase
Base class for DataTrimmer, FeatureSelector, Standardizer and SlidingWindow.
core()
Core preprocessor task, run after the instance is started with the main method. To be overridden by child classes depending on their preprocessor task.
main(args)
Starts an instance. Main entry point allowing external calls. Starts logging, parses the command line arguments and starts core.
Args: args ([str]): command line parameter list
parse_args(args)
Parse command line parameters. To be overridden by child classes depending on their command line parameters if they are console scripts.
Args: args ([str]): command line parameters as list of strings
Returns: argparse.Namespace: command line parameters namespace
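The main, parse_args and core flow described above follows a template-method pattern, sketched below with standalone stand-in classes. These are not the package's classes: SketchPreprocessor, SketchTrimmer and the --input_file and --auto flags are all hypothetical, chosen only to show how a child class overrides parse_args and core while main stays in the base class.

```python
import argparse
import logging
from typing import List, Optional


class SketchPreprocessor:
    """Hypothetical stand-in for the base class; not the package's code."""

    def __init__(self, conf: Optional[dict] = None) -> None:
        self.conf = conf
        self.args: Optional[argparse.Namespace] = None

    def parse_args(self, args: List[str]) -> argparse.Namespace:
        """Default parser with no flags; child classes override this."""
        return argparse.ArgumentParser().parse_args(args)

    def core(self):
        """Child classes override this with their preprocessing task."""
        raise NotImplementedError

    def main(self, args: List[str]):
        """Start logging, parse the command line arguments, then run core."""
        logging.basicConfig(level=logging.INFO)
        self.args = self.parse_args(args)
        return self.core()


class SketchTrimmer(SketchPreprocessor):
    """Hypothetical child class that picks a method from its arguments."""

    def parse_args(self, args: List[str]) -> argparse.Namespace:
        parser = argparse.ArgumentParser(description="sketch trimmer")
        parser.add_argument("--input_file", required=True)   # hypothetical flag
        parser.add_argument("--auto", action="store_true")   # hypothetical flag
        return parser.parse_args(args)

    def core(self) -> str:
        # Only reports which method would be called; no data is touched here.
        return "trim_auto" if self.args.auto else "trim_columns"
```

A console-script entry point would then reduce to something like SketchTrimmer().main(sys.argv[1:]), which matches the role the documentation gives to main as the entry point for external calls.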
preprocessor.preprocessor_base module
This file contains the PreprocessorBase class, which Preprocessor extends. Through Preprocessor it is the base class for DataTrimmer, FeatureSelector, Standardizer and SlidingWindow.