Conditional probability maps of extreme temperatures
Published:
So I created my first R Shiny app, and I obviously think it’s pretty cool. Props to the Shiny team for making the process of learning how to build these apps so enjoyable.
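For anyone curious what such an app involves: a Shiny app is essentially a UI definition plus a server function. A minimal, hypothetical skeleton follows; the input names and the placeholder plot are mine for illustration, not the actual app's.

library(shiny)

# Minimal Shiny skeleton: a UI definition plus a server function.
ui <- fluidPage(
  titlePanel("Conditional probability maps"),
  sliderInput("threshold", "Temperature threshold (°C):",
              min = 25, max = 45, value = 35),
  plotOutput("map")
)

server <- function(input, output) {
  output$map <- renderPlot({
    # The real app would render a map of conditional probabilities here;
    # this placeholder just reacts to the slider so the skeleton runs.
    plot(1, type = "n", xlab = "", ylab = "",
         main = paste("Threshold:", input$threshold, "°C"))
  })
}

shinyApp(ui = ui, server = server)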
Published:
There is a world of preprocessing methods from which a scientist has to choose. Each of the preprocessing tasks, cleaning, handling missing data, standardizing, encoding, binning, binarizing, and so on, comes with further choices to make. Some of these choices are data dependent, some are not; some are model dependent, some are not. I want to talk about one such choice: z-score normalizing your training set before training.
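As a quick sketch of what that choice looks like in practice, on toy data: the z-score transform is fit on the training set only, and the same training statistics are then reused on any held-out data so that no information leaks.

set.seed(1)
train <- matrix(rnorm(100, mean = 50, sd = 10), ncol = 2)  # toy training data
test  <- matrix(rnorm(20,  mean = 50, sd = 10), ncol = 2)  # toy held-out data

# Estimate the per-column mean and standard deviation from training data only.
mu    <- colMeans(train)
sigma <- apply(train, 2, sd)

# Apply the SAME training statistics to both sets.
train_z <- scale(train, center = mu, scale = sigma)
test_z  <- scale(test,  center = mu, scale = sigma)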
Published:
I recently came across a very interesting problem. I wanted to verify whether some of my data could be assumed to be normally distributed, since I wanted to use statistical methods that assume normality. One might think you could just run a statistical test for normality, such as the Shapiro-Wilk or Anderson-Darling test, on the data, and if the null hypothesis is not rejected, go ahead with the normality assumption.
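For reference, running such a test in R is a one-liner; a toy illustration (not my actual data):

set.seed(42)
x <- rnorm(100)   # a sample that really is normal
shapiro.test(x)   # large p-value: no evidence against normality

y <- rexp(100)    # a clearly non-normal sample
shapiro.test(y)   # tiny p-value: normality rejected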
Published:
The common suggestion that balancing classes in an imbalanced-class problem boosts accuracy, whether by oversampling the minority class or undersampling the majority class, is an over-generalization. In many cases it is simply not true unless the minority-class oversampling includes data augmentation. Intuitively, this is because the amount of information in your minority class is fixed even if you oversample it: you’re just creating duplicates, which do not change the decision boundary. Note that this discussion is restricted to discriminative modeling.
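To make the "just creating duplicates" point concrete, naive oversampling amounts to something like this (toy data, with a minority class of only five points):

set.seed(7)
majority <- data.frame(x = rnorm(95),          y = 0)
minority <- data.frame(x = rnorm(5, mean = 3), y = 1)

# Sample minority rows with replacement until the classes are balanced.
idx      <- sample(nrow(minority), nrow(majority), replace = TRUE)
balanced <- rbind(majority, minority[idx, ])

table(balanced$y)  # 95 vs 95, but still only 5 distinct minority points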
Published:
I’ve been encouraged by colleagues to share my take on choosing between a position in industry and one in academia after graduating with a PhD. So here goes.
Published:
As I’ve matured and taken on more responsibility in my roles, I find myself digesting a lot of non-technical content that informs my views and beliefs. To complement my technical reading post, this is a list of non-technical reading that has shaped me. It isn’t in any particular order, and while I’ll add to it in reverse chronological order, these are all timeless reads as far as I’m concerned. The topics revolve around philosophy, management, organization, leadership, productivity, customer development, marketing, and so on.
Published:
This is a list of select research works that I’ve read and liked, ordered from most to least recently read, much like a communication of my stream of consciousness. I would like to think that I could have collaborated on some of these, and it is my dream that one day I might produce works like them.