Week 7-9: First Python-based Tool!

My experience writing a Python tool that scrapes the number of citations of papers.

I started to pick up the basics of Python in the past few weeks – thanks to a 7-day hotel quarantine and badly timed jet lag. I have been following Al Sweigart’s free-to-read book (and £13.99 Udemy course), Automate the Boring Stuff with Python. I’m proud to say that last week I wrote myself a little tool in Python!

Citation Scraper

Have you ever had a list of paper titles and thought, “Hmm... wouldn’t it be nice if they were sorted by number of citations?” This little gadget is the tool for you! (Yeah, I am selling it too much >v<!) “Number of citations” information is not readily available on databases (apart from Scopus and Web of Science). Fortunately, this information, whilst less reliable, is available on Google Scholar. The tool doesn’t do anything ground-breaking – you feed the program a list of paper titles, and it scrapes the number of citations of those papers and prints them on your spreadsheet.
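To make the "sorted by number of citations" idea concrete, here is a minimal sketch of the sorting and spreadsheet-writing step only. It assumes the citation counts have already been collected into a dictionary; the function names, the example titles and the counts are all made up for illustration, and this is not the code of the actual tool.

```python
import csv
import io


def sort_by_citations(citations):
    """Return (title, count) pairs, most cited first.

    `citations` maps a paper title to its citation count, or to None
    when no count could be found. Titles without a count go last.
    """
    return sorted(
        citations.items(),
        key=lambda item: (item[1] is None, -(item[1] or 0), item[0]),
    )


def write_rows(rows, handle):
    """Write the sorted rows as CSV, which any spreadsheet can open."""
    writer = csv.writer(handle)
    writer.writerow(["title", "citations"])
    for title, count in rows:
        writer.writerow([title, "" if count is None else count])


# Hypothetical titles and counts, purely for illustration.
example = {"Paper A": 12, "Paper B": 340, "Paper C": None, "Paper D": 12}
rows = sort_by_citations(example)

buffer = io.StringIO()  # stands in for a real file opened with open(...)
write_rows(rows, buffer)
print(buffer.getvalue())
```

Ties are broken alphabetically by title, which is just one reasonable choice.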

There are existing solutions on the market that achieve this already, such as the Publish or Perish citation tool. I just thought this could be an entry-level task to test myself. “Written” is truly an overstatement – it’s more like copying and adapting code from GitHub and Stack Overflow. But the sense of accomplishment is real.

Sense of accomplishment is real!
Photo by Temo Berishvili on Pexels.com

One barrier I encountered was that, whilst the pieces of code appeared to work quite well independently when I was testing them, they did not perform consistently. One hour the tool worked, the next hour it stopped working. The code was identical, so I couldn’t understand why it wasn’t working. I was in hotel quarantine when this problem first appeared, and I joked to my brother that I must have been blocked by Google – which I later realised was exactly the case!

Turns out, scraping information from other people’s websites may violate their terms and conditions – and could be borderline illegal. Sites like Amazon and Google (and many, many others) set up limits that automatically block IP addresses when they detect a large number of requests (accesses/searches) within a short amount of time. I did not put a delay between requests in my original code, which sent thousands of searches within minutes. No wonder I was blocked!
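The missing pause between requests can be sketched roughly as below. This is only an illustration under assumptions: `fetch_all`, `fake_fetch` and the 5 to 10 second default delay are hypothetical names and values of my own, not what any site documents as acceptable, and a pause does not by itself make scraping permitted under a site's terms. The real lookup is replaced by a stand-in function so the sketch runs without touching the network.

```python
import random
import time


def fetch_all(titles, fetch, min_delay=5.0, max_delay=10.0, sleep=time.sleep):
    """Call fetch(title) for each title, pausing between consecutive calls.

    The pause is a random duration between min_delay and max_delay seconds,
    so requests are spread out instead of being sent in one burst.
    """
    results = {}
    for index, title in enumerate(titles):
        if index > 0:  # no need to wait before the very first request
            sleep(random.uniform(min_delay, max_delay))
        results[title] = fetch(title)
    return results


def fake_fetch(title):
    # Stand-in for the real lookup (hypothetical): returns the title length.
    return len(title)


# Demo: record the pauses in a list instead of actually sleeping.
pauses = []
counts = fetch_all(
    ["Paper A", "Paper BB", "Paper CCC"],
    fake_fetch,
    min_delay=0.0,
    max_delay=0.01,
    sleep=pauses.append,
)
print(counts)
print(len(pauses), "pauses between 3 requests")
```

Passing `sleep` as a parameter is a small design choice that makes the loop easy to try out without waiting; in real use the default `time.sleep` applies.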

Anyhow, this experience of testing and problem-solving has been fun! I began to understand more about the magic that fuels enthusiasm within the programming/software engineering community. I’m eager to be in a position to contribute to the conversation soon – one day I shall!

To Be Part of the Community!
Photo by Pixabay on Pexels.com

Weeks 5/6 – Time Management Tool: Eisenhower’s Urgent/Important Matrix

Sharing my experience of using Eisenhower’s Matrix and reflections on “time”.

Former US President Dwight Eisenhower is said to have popularised this time management system. By classifying tasks by their importance and urgency, Eisenhower’s Matrix has been described as the holy grail for minimising distractions. I am sharing 2 problems I have with the matrix, how it does not fit my workstyle, and some wider reflections on living a highly structured work/lifestyle.
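For readers who think in code, the classification itself can be sketched in a few lines of Python. This is only an illustration: the `Task` class, the function names and the example tasks are hypothetical, and the hard part in practice is deciding the two yes/no answers, not sorting tasks into the grid.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    important: bool
    urgent: bool


def quadrant(task):
    """Name the quadrant a task falls into, e.g. 'Urgent and Important'."""
    urgency = "Urgent" if task.urgent else "Not Urgent"
    importance = "Important" if task.important else "Not Important"
    return f"{urgency} and {importance}"


def build_matrix(tasks):
    """Group task names by quadrant."""
    matrix = defaultdict(list)
    for task in tasks:
        matrix[quadrant(task)].append(task.name)
    return dict(matrix)


# Made-up example tasks, one per quadrant.
tasks = [
    Task("Set up data access", important=True, urgent=True),
    Task("Catch up on publications", important=True, urgent=False),
    Task("Reply to a meeting poll", important=False, urgent=True),
    Task("Tidy old bookmarks", important=False, urgent=False),
]
matrix = build_matrix(tasks)
for name, items in matrix.items():
    print(f"{name}: {items}")
```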

The Eisenhower Matrix
created by Lighthouse Visionary Strategies

Problem 1 – Everything is Important!!

I’d hate to think it is only me, but a great failing of mine when I started to use the matrix was that everything I thought of seemed to be very important! At the beginning of my new role, there was quite a bit of admin required: setting up certain accounts, getting data access, signing up to the relevant mailing lists, etc. It meant that multiple conversations within and across institutes/departments happened at the same time. It doesn’t make a lot of sense to rank or compare these tasks: they don’t appear to be too important, but I couldn’t do my job if they weren’t completed. On the other hand, I have got a list full of publications on the topic that I am eager to catch up on. I conflated “important to my job” and “important to satisfy my research interests”, and had been judging the importance of tasks with a fluctuating standard. This soon corrupted my matrix, with some tasks popping on and off every 2 days, and some staying on the matrix for eternity! Consequently, the bottom right quadrant – Not Urgent and Not Important – was always empty. I failed to utilise the tool to its fullest.

Everything is Important!
Photo by Monstera on Pexels.com

Problem 2 – Poorly Defined “Tasks”

The matrix is meant to be a task-focused tool, not a progress-tracking tool to help facilitate learning. Continuing with the tasks that never leave the matrix: apart from the misjudged importance, it is also the nature of these tasks that made them so difficult to tick off. An example is “learn Python”. It is a key component of my work, highly important, and probably quite urgent too [depending on what timescale we’re talking about] if I want to make any real progress. But I could never cross off that task and call it done: even after I have completed 20 hours of tutorial videos, worked through a textbook, and coded my first little gadget in Python, I don’t feel confident enough to say that I have “learned Python”. The matrix is not meant for progress tracking, but rather for shortening to-do lists. Some could argue that the problem lies with my non-SMART goals, and that I shouldn’t judge the capabilities of the matrix on that basis [SMART = Specific, Measurable, Achievable, Relevant, Time-bound]. However, I do not think it is realistic to map out the whole learning process into tiny surrogate markers of achievement. Does the ability to copy and paste multiple sections of code from GitHub mean I am capable of doing a task? How many errors or rounds of trial and failure are tolerable for a “good” coder developing a new Python gadget? Is it the “coder’s mindset” I should be valuing, or should I be taking examinations to benchmark my progress? The checklist approach to learning did not work for me.

Time Time Time.
Photo by Ron Lach on Pexels.com

Reflections:

We find comfort in structure. We need structure to guide our attention, to assert our mastery over time. Time is being broken down into smaller units with higher precision to monitor progress, efficacy and production. We sure are living in a fast-paced world, but it is not just the pace: the accuracy and rigidity of time have consequently projected themselves as the more appropriate way of living, as a “truth” that is more true than how time was experienced in the past. The passing of time is universal (well, sorry theoretical physicists), but the construct, measurement and experience of time are manufactured and constantly updated by society, by us. We fabricated this need for speed, which in turn necessitates more precise measurement. In cultures where the obsession with time has yet to take over, e.g. in African cultures, the way of living and experiencing time has often been remarked upon pejoratively. Injustice might mask itself as progress; greed as philanthropy; derogation as inclusion.

Despite our emphasis on time, and the structure that comes with it to help us master time, not having spent enough “quality time” with loved ones is said to be one of the most common deathbed regrets in modern times. Not all “times” are born equal. Our ability to just relax and enjoy the moment is being chipped away, checkbox by checkbox. The guilt of wasted time spills over and burdens us even more. I am sure the structure helped a lot in the industrial revolution to get the factories rolling; perhaps it will serve a similar role as AI replaces half of the labour force. How do we find quality in our time, and befriend time rather than compete with it? Tools like Eisenhower’s Matrix should help us build this healthy relationship with time, not lead us to see ourselves as the Lords of time. Be humble!

[Finished reading Beyond Measure by James Vincent, who describes the history of the measurement of time quite nicely.]

Week 3: Marathon, Not a Sprint

Week 3: PhD thoughts inspired by a recent 5k run.

Last Saturday, my partner and I participated in the 5km Parkrun nearby. We’re all dandy, in other words, untrained. This was the first time we were both able to free ourselves from the shackles of our comfortable beds on a Saturday morning.

The goal was to finish the run in one piece. We started off at a nice pace, hovering at around 400th place out of 600+ runners. Unlike the last time I joined, there were no muddy piles from the rain. A little tailwind accompanied the sunlight to give us an extra boost.

This extra boost came back to haunt us in unexpected ways. We were too used to running on treadmills, and we could not adapt to the natural landscapes. The tailwind must have also pushed us beyond our typical pace. My partner went slightly over her limit as her knees started to complain as we crossed the half-way point. We had to slow down.

As we squirm forward at the speed of rush-hour traffic in London, I start to feel the urge to just dash off and get back to my own pace. I reckon we must be at the tail of the crowd! My inner competitiveness wants to take over: it’s such a nice opportunity to set a personal best! My partner adds fuel to the fire and encourages me to go: “just wait for me at the finish line!” Indeed, why shouldn’t I think less and run?

As I fall into the conundrum, I see how the situation somehow resembles my PhD journey. What is it that I value in this process? Is it to finish as fast as I can, in record-breaking time? Or is it to take my time in learning, doing slow but meaningful science? It’s never either/or, but setting a goal and sticking with it would help me prioritise what’s truly important to me. At this point, it is to cherish my status as a student, to dive into theoretical puzzles, challenge myself with new skills, connect with people I would not otherwise dare speak to, and spend time with the ones I love.

We crossed the finish line together.

Week 2: Getting Real

Week 2 is a philosophical one. More reflection on how this world operates.

Week 2 was much less eventful compared to week 1. It is likely a more truthful depiction of a typical week in the coming 3 years.

Measure and Routine Practices

Why do we do what we do the way we do it?

Several constellations lined up to trigger this train of thought. I recently finished listening to Desperate Remedies: Psychiatry’s Turbulent Quest to Cure Mental Illness by British sociologist Andrew Scull, whilst starting James Vincent’s first book, Beyond Measure: The Hidden History of Measurement. The challenge faced by psychiatrists in the 1970s as they put together DSM-III was not a new one. It is the problem of establishing a reliable measure. The French trying to establish the metre, the Chinese emperors defining the tunes, and the Egyptians keeping time all faced the same task: to be reliable in what they measure. A proper measurement often relied on a naturally occurring (hence valid) phenomenon to establish its reliability, which is relatively easy to do for some things, e.g. how sundials and water clocks were used to track time. Mother Nature became their guarantor. For other constructs, like friendship, happiness, rights and responsibility, we are less able to do so, or at least haven’t found a way of reliably doing so yet. How we measure things tells us a lot about our understanding (or the lack of it) of the phenomenon.

Photo by Moose Photos on Pexels.com

The same applies to research in health equity. What is recorded and how it is recorded matter, and these directly influence what is available in our routine administrative data. For example, poor uptake of psychological therapy in an ethnically diverse catchment area does not simply mean that there is strong stigma; it may reflect a more entrenched distrust in the system, a lack of support for people to access services, etc. Moreover, alternative support provided by community members and cultural practices is simply not recorded, and so is discounted from routine records. From this snapshot understanding of the “evidence” for poor therapy uptake, what could be a proper policy response? It is impossible to tell from the data alone, and this is because of how we decided to frame and measure access.

It raises the question: who decides what to measure, and how? Under this veil of evidence-based policy making, which groups of people are routinely under-represented? I reflected on some of these questions in my blog earlier this week (Reflecting on Ethnicity in Research – Challenging the Default). These are the questions I will keep in mind and keep asking myself as I carry on with my PhD research.

Learning Python

Starting to experience once again the joy and frustration of learning a new programming language. Successfully installed the relevant packages – celebration! Failed to reliably activate my virtual environment – felt defeated… I have been forking people’s repos on GitHub but am struggling to understand the process… Would appreciate any tips on picking up Python!
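On the virtual environment frustration, one small check that may help is asking Python itself whether it is running inside an environment. The sketch below assumes an environment created with the standard `venv` module (or a recent `virtualenv`); the helper name is my own, and it only reports which interpreter is in use, it does not fix activation.

```python
import sys


def in_virtual_environment():
    """Return True when the interpreter is running inside a virtual environment.

    Inside a venv, sys.prefix points at the environment folder, while
    sys.base_prefix still points at the Python installation it was made from.
    """
    return sys.prefix != sys.base_prefix


print("Interpreter in use:", sys.executable)
print("Inside a virtual environment:", in_virtual_environment())
```

If the second line prints False after activation, the script is most likely being run by a different Python than the one inside the environment.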

Week 2. Solid 6.5/10.

Week 1: The Beginning

First week of PhD: thoughts on Remote Start, Software Nightmare, and Academic Career Progression.

“Welcome to UCL” – 10 online induction courses (no kidding!), though I doubt I will remember a lot from them. Possibly the point is to leave a gist, an impression of what the college values: fire safety, implicit bias, data security… All is well! Changing jobs is never simple, and doing this in a remote-working era makes it … a bit weird? But I presume it is something for all of us to get used to.

The plus side of everything going online is that I get to attend A SWARM of online talks, seminars and groups. It does feel a bit overwhelming to start with – my schedule is quickly populated with meetings and invitations, and there is always this prominent speaker coming, that core training one cannot miss. I can’t help but wonder: will I ever attain the wisdom to determine which talks are the truly good ones I should listen to? It would be a thing to reflect on, perhaps a few weeks down the line..!

A Virtual Office
Photo by Laura Davidson on Unsplash

Software Nightmare

The excitement of starting at a new post was quickly overtaken by the frustration of – you guessed it – installing the relevant applications and software on my laptop! Numerous emails, calls and remote access sessions later, I am still not able to get all I need. There must be more flexible ways for colleges to adapt to this fast-changing landscape of software development! Take Python as an example: the only version that is easily installable via the college software centre is version 3.6.4, which fails to satisfy the dependencies of a lot of recently developed software. Guess it is always this tug of war between data system safety & integrity vs freedom & flexibility…! Hope it all gets sorted next week!

Imagining an Inclusive Academia

UCL provides clear guidance on career progression – the Academic Career Framework (see below) – with a comprehensive list of things one is expected to achieve at UCL Grade 7 and above. This is something I had never heard of! It gives substantive structure to what is needed to progress at UCL, in other words, the things that are (currently) valued by the college.

4 Components of the Academic Career Framework (UCL)

It is said that contributions to all 4 categories are necessary to measure one’s achievement. I appreciate the attempt to provide clarity on progression, and I can see a wider potential for these frameworks to revolutionise academics’ roles in society. As Dr. Nadia Islam rightly put it, a community-focused collaborative role needs to be more heavily emphasised in academic research. This would be part of a change I would like to see, and contribute to, in academia in the near future!

I think this pretty much wraps up week 1 – excited to continue on this journey, and hope that you will be journeying with me 🙂