Contributing
This page documents how to download and set up the project and how to contribute to the project's GitHub repository.
1 Setup
1.1 Clone the GitHub repo locally
Copy the project into a local directory called kaggle-march-madness-men-2019/.
git clone https://github.com/YouHoo0521/kaggle-march-madness-men-2019.git
Move into the directory.
cd kaggle-march-madness-men-2019
1.2 Set up virtual environment
Create a new virtual environment called march-madness. Do this once.
virtualenv venv/march-madness --python=python3.6
Activate the virtual environment. Do this before every session.
source venv/march-madness/bin/activate
Install the required Python packages into the virtual environment. Do this every time the requirements.txt file changes.
pip3 install -r requirements.txt
- Update requirements.txt if the source code requires new packages.
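Because activation has to be repeated in every session, it can help to check whether the environment is currently active before installing packages. The following is a minimal sketch, assuming only that virtualenv's activate script exports the VIRTUAL_ENV variable; the function name venv_status is hypothetical and not part of this project.

```shell
# Hypothetical helper: report whether a virtual environment is active.
# It only inspects VIRTUAL_ENV, which virtualenv's activate script exports.
venv_status() {
    if [ -n "${VIRTUAL_ENV:-}" ]; then
        echo "active: ${VIRTUAL_ENV}"
    else
        echo "inactive"
    fi
}

venv_status
```

If this prints inactive, run the source command above before calling pip3 install.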
1.3 Install the project package
This project is organized as a Python package. We can write the source code in the src/ directory and import it from elsewhere, such as in notebook/.
We can install the package (into our virtual environment) in development mode so that the changes we make to the source can be used immediately.
python setup.py develop
At this point, we can import our package from anywhere by calling:
import src  # import entire package
from src.data import make_dataset  # import a module
from src.data.make_dataset import get_train_data_v1  # import a function
2 Develop
By default, git points to the master branch, which is the production version. We want to develop in a separate branch and merge the changes back into master.
Confirm that you're on the local master branch.
git status
The first line should say On branch master. If not, run
git checkout master
Pull the latest updates from the origin/master branch. origin refers to the remote repo on GitHub, which is the official version of our code.
git pull origin master
Create and check out a new branch off of master. The following command is a shortcut for creating a new branch called dev_logistic_regression and moving into it.
git checkout -b dev_logistic_regression
- Write code.
  - Put reusable code in the src/ directory
  - Put exploratory analysis in the notebooks/ directory
  - Put scripts in the bin/ directory
    - e.g. command line scripts for the ML pipeline (data prep, training, cross-validation, evaluation)
Stage changed files for commit.
git add new_file_name
git add modified_file_name
git add deleted_file_name
Commit changes locally.
git commit -m "Write message here."
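The develop loop above can be rehearsed safely in a throwaway repository before trying it on the real project. The sketch below is illustrative only: the file names, commit messages, and the demo identity are hypothetical, and it assumes git 2.0 or later so that git add stages a deleted file.

```shell
# Rehearse the develop loop in a scratch repository (the real project is untouched).
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
git init --quiet
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is named master

# Hypothetical identity, stored only in this scratch repository's config.
git config user.name "Demo User"
git config user.email "demo@example.com"

# Seed master with two files so there is something to modify and delete.
echo "keep" > modified_file_name
echo "old" > deleted_file_name
git add modified_file_name deleted_file_name
git commit --quiet -m "Seed master."

# Create and move into the development branch.
git checkout --quiet -b dev_logistic_regression

# Make one change of each kind: new, modified, deleted.
echo "new" > new_file_name
echo "changed" >> modified_file_name
rm deleted_file_name

# Stage each change, then commit locally.
git add new_file_name
git add modified_file_name
git add deleted_file_name
git commit --quiet -m "Add, modify, and delete files."

# Show the current branch name and a summary of the last commit.
git rev-parse --abbrev-ref HEAD
git show --stat --oneline HEAD
```

Running git status afterwards should report a clean working tree on dev_logistic_regression.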
3 Push Changes
When your code is ready to be checked in (after one or more local commits), you can push your local branch to the GitHub repo and submit a pull request.
Push your local branch (e.g. dev_logistic_regression) to GitHub. This will create the origin/dev_logistic_regression branch.
git push origin dev_logistic_regression
- Go to the project's GitHub page, navigate to your new branch, and click New pull request.
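To see how pushing a branch creates its origin/ counterpart without touching GitHub, the push step can be tried against a local bare repository that stands in for origin. This is a sketch under that assumption; the directories, file names, and demo identity are hypothetical, and the pull request itself still has to be opened on GitHub.

```shell
# A local bare repository plays the role of the GitHub remote (origin).
remote_dir=$(mktemp -d)
work_dir=$(mktemp -d)
git init --quiet --bare "$remote_dir"
git clone --quiet "$remote_dir" "$work_dir"
cd "$work_dir" || exit 1
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is named master

# Hypothetical identity, stored only in this scratch clone's config.
git config user.name "Demo User"
git config user.email "demo@example.com"

# Seed master and publish it so the stand-in remote has a production branch.
echo "seed" > seed.txt
git add seed.txt
git commit --quiet -m "Seed master."
git push --quiet origin master

# Develop on a separate branch, then push that branch to origin.
git checkout --quiet -b dev_logistic_regression
echo "model" > model.txt
git add model.txt
git commit --quiet -m "Add model placeholder."
git push --quiet origin dev_logistic_regression

# List remote-tracking branches; origin/dev_logistic_regression should now appear.
git branch -r
```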