Krishna Agaram

A scraper for internal ASC

This is a short write-up about a scraper for IIT Bombay’s internal ASC website; it might be useful during the registration period, all available in one place.

Scraping is done using selenium and the web-driver for Google Chrome. The idea is to first login, and examine manually the HTML tags for the labels of the relevant table entries that need to be clicked. For example, running courses or grading statistics. The selenium.WebDriver.find_element method, and the option By.XPATH is golden for this purpose. The rest is simply trial-and-error routine python code.

The scraper is available here as an iPython notebook. The resulting run for the Autumn semester, 2023-24 is available here (it contains a list and description of all courses that ran, with grading statistics). For the results split by department, use this tarball.

The scraper is a good starting point to get your hands dirty with some web scraping, and also can be extended with very similar additions to scrape other parts of ASC, if needed. The results are easily subjected to some analysis, to find out courses with high average grades, or courses with high failure rates, etc. The possibilities are endless.

Use it wisely, and happy scraping!

But why is it so hard to build a habit? Might just light up some bulbs đź’ˇ

Most people know the familiar feeling of regret when they give up on developing a habit that they know is good for them. It’s a feeling that’s hard to shake off, and it’s a feeling that’s hard to ignore. It simply feels, well, unbearable to continue with the habit, and resistance wins. I have been wondering for a while now why this might be the case, and I will share here what I have found so far.

The ancients developed habits and patterns of thinking for immediate (and preferably, as comfortable as possible) survival - that was top priority. The modern age has made this mostly unneccessary, since we have the luxury of safety. However, the pattern of immediate comfort is still with us, though no longer optimal.

The brain is still wired to seek out immediate comfort and/or avoid immediate discomfort. The big problem with this is that the habits that are good for us in the long run are often very uncomfortable in the short run. Let’s look at how a habit forms. This analysis is mostly based on 1 by Hal Elrod.

A good habit generally takes thirty days to form. I would like you to take a habit you want - typically, becoming a “morning person” or churning out that “morning workout” works for most people - and imagine yourself through the process below of developing that habit.

The first two-three days comprise a sort of honeymoon period, where the habit is new and exciting, and you are constantly dreaming of yourself with the benefits of the habit. But then they do not, of course, show up after two days. The next week, it is ridiculously unbearable to continue the habit, as you believe less and less in its arrival. This is sort of like how we always expect that Amazon package to show up in two days, and when it doesn’t, we are pretty impatient every day after that.

A comical example of this phenomenon in real life is the abundance in parking space at the local gym just the week after New Year’s following long lines in the first week (yes, there was a study on this).

After ten days, it is excruciating to continue for one more day simply with willpower, and this is where it starts to make sense to move it to auto-pilot, with techniques like habit stacking. A few benefits might start to show up after two weeks, but they are not that significant. It is the uncomfortable phase of the habit formation process. You are not experienced, and so are slightly less reluctant to continue. The habit is not yet part of you, though, and its benefits are not great yet.

After three weeks, the habit has now become a part of your routine, and you start to associate with it. The benefits are now starting to show up, and you see the light at the end of the tunnel.

Quoting Hal Elrod in 1, “The third 10-day phase is crucial to sustaining your new habit, long term. The final 10 days is where you positively reinforce and associate pleasure with your new habit. You’ve been primarily associating pain and discomfort with it during the first 20 days. Instead of hating and resisting your new habit, you start feeling proud of yourself for making it this far.”

This is the unstoppable phase of the habit formation process. The habit is now slowly becoming a part of your identity, and you are starting to see the benefits of it. It is likely that you will continue the habit, since you have now seen the benefits of it in your life.

After thirty consecutive days with the habit, it has become a part of your identity and you will continue with it for a while.

It is difficult to build a good habit - there is no denying that - but it is far better to spend thirty days of discomfort to get a lifetime of benefits - pounds of pain to get that habit always beats tons of regret, logically speaking.

The problem is that as with the above process, the nice things about a habit are always delayed with uncomfortable things in the short run. Our wiring of the brain to avoid immediate discomfort is quite difficult to avoid.

However, when you realize that the benefits of a habit are simply delayed and not non-existent, and that the discomfort is temporary, it becomes slightly easier to continue with the habit.

To make it less difficult to build a habit, a few ideas follow.

If you think about it, we have an identity about what kind of person we are. For example, some people might think of themselves as “night owls”. The ancients had their own identities; they were known by their identities.

There is a difference between a person’s identity and character. Your identity is who you truly believe you, as a person are. Your character is how you function as a person. Some aspects of your character influence your interaction with the world, others influence your own personal feelings and happiness, and some are well, just hidden by will.

Anyways, no surprise that your identity determines, along with your environment and opportunities, your character.

The point is, a habit is simply a change in your character that you want to make. The best way to do that? Put it into your identity. You can’t control fully your environment (more on that later). But hell yeah you can control your identity. You can make it whatever you wish. Essentially, you now think of yourself as a “morning person” or a “workout person” to help build the corresponding habits of waking up early or working out.

Well, the obvious fallacy in this argument is that it isn’t true yet that you are a “morning person” or a “workout person”. How can you just “add” that into your identity when you have close to no overlap with it in terms of who you are currently?

Valid point. The solution is to start to make it part of your identity. That is, to affirm and re-affirm that you are becoming a morning person or a fit athlete. You are not lying to yourself here, and it is indeed true, when you start building the habit, that you are becoming a person with the habit. It is your identity now: I am learning this habit.

The great thing about this is that you are tricked into continuing, using momentum. After a few days, the fact that you are becoming a morning person is confirmed by your beginning practice of waking up early, and you reinforce the identity. The next day, your growing identity gives you a bit of a push to continue.

It is very much like the Duolingo streak, where you are tricked into continuing by the fact that you have a streak going.

The all-important thing is to really commit to becoming a morning person or a workout person. Only then can you believe that you are becoming one, and this identity trick will work. Committing is where 95% of the world struggles. They want something, but are unwilling to really decide to get it - because will it really be that bad now if I don’t make it?

Yes - in the ancient times, and so they didn’t need any tricks to force them to make it. Today, thanks to a reasonably stable life for most people, the answer to the question is usually in the negative. And so one concludes that one doesn’t need to do it so badly. The regret only comes years later, when the regret of not doing it is truly unbearable, knowing that the benefits of the habit would have been so great in compounding over time.

Once you are committed to building a habit, reinforce your identity as starting to become someone with the habit. With this identity and practicing the habit, after a week or so, your identity will already start to become “I am a morning person” or “I am a workout person”. That will go a great distance, and you have already won half the battle.

The discomfort of building the habit is now acknowledged as a necessary part of the process, and now that it is in your identity, you will forage forward because that is who you are. Slowly, but surely, the new identity will push the habit into your character.

The other thing that determines your character is your environment. You can probably guess what’s next. You want the habit in your character, why not put it in your environment too? Makes it even easier to get it into your character. Indeed it does! James Clear has a lot to say about this in 2, and for completeness I will try to present the essence of his ideas here.

Suppose the lights are turned off and your bedroom is dark tomorrow morning at 7:00 AM. Would you, if not used to being up this early, go out of your way to get yourself up and out of bed? I certainly wouldn’t. But if the lights are on, the sun is in your face, and you can hear the hustle and bustle of the day outside, you are much more likely to get up. I know I am. This is because your environment is now conducive to waking up early.

The time of day did not change. It was the environment that changed, which changed your reaction along with it.

James Clear suggests a ton of ways to make your environment conducive to building numerous habits, and I really recommend checking them out in 2; some of them have really helped me build some habits.

For now, since I think that a lot of people want to build the habit of waking up early, I’m going to detail Hal Elrod’s idea (in 1) of making it obvious to wake up early. If you want to build this too, I’d like you to imagine yourself through the process below.

It starts by moving your alarm clock or phone to the other side of the room - near enough that it still gets you out of your sleep and irritates you enough to want to turn it off, but far enough that it can only be reached with at least a few seconds completely off the bed. In the morning, you are woken up and forced to walk and turn off the alarm.

You are still probably only thinking about the discomfort of having to exist without that bed and blanket, and are really contemplating going back. You convince yourself it’s okay, since you are so tired and probably need the extra ten minutes of sleep before your next alarm. That’s when you do step two: you have a glass of water ready to drink right next to the alarm.

Most people are typically simply dehydrated in the morning, though it might seem like they are tired. Drinking a glass of water immediately after waking up can cause you to make the distinction. You drink the water, and are not really so tired anymore, but it’s still quite uncomfortable to be out of bed early, especially if there are people nearby who are still sleeping - because why me? The next thing is to go brush your teeth.

This is typically hard, and is a sort of commitment to waking up now. We will see how to make that easier in a bit. Once you brush, you have more or less committed to being up. As a final step, keep whatever you use for your routine - your workout clothes, your journal, your book, your phone, your computer - on your desk to make it easy for you to start the routine. Et voila, you are go for the day.

There are two things more important things that have existed as loopholes in the above process. The first is that - well, what if I am actually tired, even after drinking the water? Well, it is very likely then, that you have not slept early enough to wake up at this time.

Arguably a harder thing to fix is the sleep time - but that is precisely the thing that comes from your identity of starting to become a morning person. You are starting to wake up early, huh? You’ll need to sleep early then, otherwise you can’t. You are slowly committed to sleeping early. Another fix is simply to tell someone to force you into bed early.

Our parents did this when we were young. An interesting fix is to simply time the lights and the internet to go off at your bedtime, leaving you with no option but to sleep.

The second loophole is that of deciding to go to brush, i.e. starting the day at this time. Again, commitment is a fix. But there is another helpful technique here. It is known that the last thoughts before sleep is among the first thoughts after wake up. So, if you set your intentions the night before and take responsibility then for giving you some excitement or deadline for the next day, you are very likely to be pinged by the same thought in the morning, that can make all the difference in getting you to start the day then.

To summarize, we have seen two ways that can help make the process of building a habit less uncomfortable and more doable - by making it part of your identity and by making it part of your environment. Crudely put, the first is a mental trick, and the second is a physical trick.

Of course, there are others; one famous person times the following message “I am still slouching in bed” to be sent out publicly on Twitter at 6:00 AM the next day, and the embarrassment of that gets him out five minutes before, to push the sending back to the next morning. It’s what James Clear calls “Make it inevitable” in 2 - there is simply no other option but to follow through.

Another is what is called habit stacking (see 2), where it is in your routine to do something after/during something else you already do. For example, one techy kid connected his ipad’s internet to his gym cycle, so that he could only use watch Netflix while cycling. Or to time the internet to go off after dinner.

This is a great idea and acts like a walker till you can walk on your own. However, there is a caevat. It is a great way to start building a habit, but it is not a great way to sustain it. The idea is to make the habit so much a part of your identity and environment that you don’t need the walker anymore.

The last part of this article is about the habit-building process and how we are often miles off in our expectation of how it will be. The idea is very simple.

Most people overestimate what they can do in a day or week and underestimate what they can do in six months.

Many people might relate to this. I do, surely. The reason for this is quite simple - (1) our society at large believes in overnight successes, which are never close to the whole truth, never mind what the media says about it, and (2) we are terrible at estimating the power of compounding. This mindset is remedied over time, as you experience more. The important thing is to reinforce the fact that the mindset is not correct from your own personal experiences.

Of course, this mindset is not conducive to building habits, since the benefits of habits show up in a manner exactly opposite to what is expected. Understanding this mindset conciously gives a plethora of ways to build habits and avoid it from affecting you.

For example, one way to realize that “missing it twice” isn’t the way to go. Missing it once is okay. Circumstances could force it. But the moment you miss it once the circumstance is no more an obstacle, it compounds and is the beginning of a new habit - the inverse.

Another example is that of atomic habits, advocated by James Clear in the epoynmous book. He suggests that you should make your habit practice so small that they are easy to do, however badly you overestimate your day and realize a shortage of time. For example, if your goal is to wake up at 6:00 AM regularly, make sure for the first week that your alarm is set five minutes earlier than usual.

That’s it. It’s very hard to resist it now, since it doesn’t really seem like a change. The most important thing is showing up for the habit. Once five minutes earlier becomes your new normal, you can move it to ten minutes earlier, and so on. This is the idea of making the habit so small that it is hard to resist.

That the benefits of the habit will follow in full in due time follows from the fact that these changes compound crazily over time, however much our underestimation of the power of these atomic changes discourages us. Ask any bodybuilder.

In short, trust in the compound effect. The most successful people are not overnight successes. It has taken years of struggle to get to where they are. It is laughably stupid to think that you can get to where they are in a day or a week. But it is also equally stupid to think that you can’t get to where they are in a year or two. The power of compounding is immense, and it is the most important thing to remember when building habits.

To conclude, building a habit is not an easy process, but the right perspective and understanding of the process, coupled with a conducive environment and identity, can make it much easier. You’re en route to your wildest dreams, and the only thing stopping you is you. Give it a shot.

If this was a substantial light bulb for you, dear reader, I am glad for you. You see, a very large proportion of the world - 95% - is living a life of mediocrity and putting off (and regretting) the things that are good for them (see 1 for more details). I believe humans are far more capable than that, and I hope that a better future awaits us. I will try to do my bit. I hope you will too, now.


Acknowledgements: This is mostly just techniques from a few remarkable people and my own experiences as I have tried to build habits. I have tried to put them together in a coherent manner, and I really do hope it helps you, if you think you needed it.

References

  1. Hal Elrod, The Miracle Morning. website.  2 3 4

  2. James Clear, Atomic Habits. website. Also the Rich Roll podcast with James Clear. link.  2 3 4

Approximating the maximum weight of a triangle-free subgraph

This is a write-up of a problem that appeared on one of my course tests. The problem is as follows:

Let \(G\) be a simple undirected graph with positive integer edge-weights. A subgraph of \(G\) is called triangle-free if it does not contain any cycle of length 3 (aka no triangles). The problem is to find a maximum-weight triangle-free subgraph of \(G\).

The problem is \(NP\)-hard, so we might as well try approximating it instead. The most natural triangle-free subgraphs are the bipartite subgraphs of \(G\) - how large a bipartite subgraph can we construct from \(G\) in polynomial time?

Show how to compute a bipartite subgraph of weight at least \(W/2\), where \(W\) is the sum of the weights of the edges in \(G\).

Hint 1: The problem is a classic example of a greedy algorithm.

Hint 2: Can you guarantee that for every vertex \(v\), the edge weight between \(v\) and its neighbors in its part is at most the edge weight between \(v\) and its neighbors in the other part? That would imply the problem.

Hint 3: Starting from an arbitrary partition and then greedily moving vertices across the parts may not be a good idea. Start from empty bins. There is a \(\mathcal{O}(n + m)\) time algorithm.


Solution: Set up two empty bins, which will serve as the bipartition. Order the vertices arbitrarily - \(v_1, …, v_n\). Starting from \(i = 1\), put \(v_i\) into the bin with the smaller total edge weight to the vertices already in the bin. Notice that each edge of the graph will be looked at precisely once, and at least half the total weight of the edges goes across the bins. The running time is the sum of the degrees + \(cn\) overhead - \(\mathcal{O}(n + m)\).

The motivation for this algorithm comes from the following inductive proof of the existence of such a subgraph. We induct on the number of vertices \(n\) - the statement is vacuously true when \(n = 1\). For \(n = k+1\), delete an arbitrary vertex \(v\); the rest of the graph, by hypothesis, has such a subgraph - suppose the partition is \([U, V]\). Finally, put \(v\) into the bin with the smaller total edge weight to the vertices already in the bin.


Before we get back to our original problem, let us consider the "complement" problem: finding a minimum weight subgraph \(H'\) such that every triangle in \(G\) contains at least one edge of \(H'\). It is clear that any solution \(H'\) when complemented in \(G\) is triangle-free. Can we manage to get a low-enough solution to the complement, then?

Let \(\text{OPT}_2\) denote the weight of the minimum weight subgraph \(H'\) such that every triangle in \(G\) contains at least one edge of \(H'\). Describe an algorithm that computes a subgraph of weight at most \(3\text{OPT}_2\).

Hint 1: LP rounding/Primal Dual.

Solution: We first write down the LP (linear program) corresponding to the relaxation of the complement problem. Letting \(x_e\) denote the variable representing whether edge \(e\) is in our subgraph or not, we have the following LP relaxation:

\(\begin{aligned} \text{minimize} \quad & \sum_{e \in E} w_e x_e \\ \text{subject to} \quad & x_{uv} + x_{vw} + x_{wu} \geq 1 \quad \text{for all triangles $uvw$ in $G$} \\ & x_e \in \{0, 1\} \quad \text{for all $e \in E$}. \end{aligned}\)

This is actually an instance of the [set cover](https://en.wikipedia.org/wiki/Set_cover_problem) problem, which has an \(f\)-approximation rounding/Primal Dual algorithm (where \(f\) is the maximum number of sets containing an element of the universe - here, the elements are the triangles of the graph \(G\) and the sets are the edges; each edge covers triangles it is a part of and each triangle is covered by precisely 3 edges).

The rounding algorithm is easy; we solve the LP relaxation optimally, and then round every \(x_e\) with \(x_e^* \geq 1/3\) (\(*\) denoting the optimal solution) to \(1\), and the rest to \(0\). The resulting solution is integral, feasible for the integer linear program, and has scaled up the objective by at most a factor of \(3\).

The Primal Dual algorithm is also fairly simple, but we do not go into that here, at the moment.


We are ready for the first approximation algorithm for the original problem.

Let \(\text{OPT}_1\) denote the weight of the maximum weight triangle-free subgraph of \(G\). Describe an algorithm that computes a subgraph of weight at least \(\frac{3}{5}\text{OPT}_1\).

Hint 1: Leverage both algorithms we have already seen. How do you leverage two algorithms to get the best of both worlds?

Hint 2: Suppose the optimal value \(\text{OPT}_1 \leq \frac{5W}{6}\). Are we done? If this is not the case, does the algorithm we have for the complement problem help us?


Solution: Simply run both algorithms (of course, after running the second algorithm and getting a subgraph \(H'\) we must complement it in \(G\) to get a triangle-free subgraph), and take the larger of the two solutions. That way we are able to avoid the worst-case scenarios of each algorithm individually, if the worst-case scenarios are different for the two (they happen to be beautifully complementary in this case). Let \(h_1\) be the weight of the graph returned by the first algorithm, \(h_2\) by the second, and \(\mathcal{A} = \max(h_1, h_2)\) is the value returned by our final algorithm.

We have:

\[\begin{aligned}h_1 &\geq \frac{W}{2} = \frac{3}{5}\cdot\frac{5W}{6},\\ h_2 &\geq W - 3\text{OPT}_2 = W - 3(W - \text{OPT}_1) = 3\text{OPT}_1 - 2W.\end{aligned}\]

If the optimal value \(\text{OPT}_1\) is at most \(\frac{5W}{6}\), then indeed, \(\mathcal{A} \geq h_1 \geq \frac{3}{5}\text{OPT}_1\). If this is not the case, then \(\text{OPT}_1 > \frac{5W}{6}\), and

\[\mathcal{A} \geq h_2 \geq 3\text{OPT}_1 - 2W \geq 3\text{OPT}_1 - 2\left(\frac{6\text{OPT}_1}{5}\right) = \frac{3}{5}\text{OPT}_1.\]

Incredible, right?

It is also easy to see that bettering each of the two algorithms individually could improve the approximation factor of the final algorithm. The first algorithm is actually the best we can hope for; the max weight of a bipartite subgraph of the complete uniform-weight graph on \(n\) vertices is \(W\left(\frac{1}{2} + o_n(1)\right)\). The second algorithm, on the other hand, can be improved (thanks not to the fact that it is a set-cover; for the set cover that is the best we can hope for with that formulation, since the integrality gap is \(3\)) to an approximation factor of \(2\)!

The factor \(2\) approximation then establishes a \(2/3\)-approximation for our final algorithm, by completely analogous reasoning as above. It remains to show the \(2\)-approximation for the second algorithm.


Describe an algorithm that computes a subgraph of weight at most \(2\text{OPT}_2\).

Hint 1: Induct on the number of edges of \(G\). You wish for some \(x_e\) to be at least \(1/2\) so you can round it up to \(1\) and only scale the objective by a factor of \(2\). If there is an edge \(e\) with \(x_e^* = 0\), use induction to reduce the problem to a smaller graph. You might need to have to solve the LP optimally each time down this iterative reduction, this is okay.

Hint 2: The bad case is then when every \(x_e\) is positive. Complementary slackness comes to the rescue. Compute the exact dual optimal solution. Is this small enough that a \(2\)-approximation to this is quite easy to find?


Solution: We induct on the number of edges in the graph.

(the reader is invited to finish the solution until the author gets around to it 🫣)