I have a panel of stocks with these columns: Date, Ticker, Ret, MTB. I want to rank the stocks based on the MTB value, every month, and put them into 10 groups. This procedure is needed when I want to make a portfolio. I used to do this very easily in SAS using the following code :
proc rank data=dt out=dt group=10;
by Date;
var MTB;
ranks MTB_Rank;
run;
In Python, however, I couldn't find a straightforward method to rank and group. I used the following method to rank and group by date but the answer is not satisfying!
***dt['MTB_Rank'] = dt.groupby(['Date'])['MTB'].transform(lambda x: pd.qcut(x, q=10, labels=False, duplicates= 'drop'))***
The problem is that it doesn't create 10 groups for every date. Infact, it just creates 7 groups for the whole data and each date might only have some groups. I also cannot remove "duplicates= 'drop'" because then the code shows me an error that the Bin edges must be unique. I would be thankful if you can guide me through this problem. Thank you !!!
it's sometimes better to split into smaller steps.
1. Sort your DataFrame properly
2. Create a simple iterator column (same as Rank):
3. Refresh group after threshold: