So you want to be a data scientist. I’ve compiled some resources for the budding, and perhaps the experienced, data scientist. Online academic courses hosted as MOOCs on edX, Coursera and Udacity are listed below, some of the skills needed are covered in section 2, and a compilation of classic problems in data science can be found in the section after that.
Online Academic Courses
Below is a list of online resources for those interested in data science. Much of the course material including lectures, assignments, exams, solutions, slides, readings, notes and discussions can be found by clicking on the relevant link.
I completed some or all of the work for the courses whose university names appear in bold. For example, I recently completed MITx’s The Analytics Edge on the edX platform, while Berkeley’s Apache Spark class has just started.
The data scientist needs skills in statistics, programming, databases, visualization/graphics and computer science. Below is a non-comprehensive list of some of those skills that I have some familiarity with. Of course, as time goes on the list of necessary tools and skills will grow since DS is by its very nature an interdisciplinary field.
Python is the most popular scripting language, and its many open source modules are proof of widespread and growing adoption of Python as the language of choice in academia and industry. You may meet a Perl guy someday and wonder how he missed the Python boat. Below is an adapted version of a list from this github repository, which gives a much broader view of useful Python libraries for data science:
- Fundamental Libraries for Scientific Computing: IPython Notebook, NumPy, pandas, SciPy, pySpark
- Math and Statistics: SymPy, Statsmodels
- Machine Learning: Scikit-learn, Shogun, PyBrain, PyLearn2, PyMC
- Plotting and Visualization: Bokeh, d3py, ggplot, matplotlib, plotly, prettyplotlib, seaborn
- Data formatting and storage: csvkit, mrjob, PyTables, sqlite3, lxml, BeautifulSoup, wget, curl
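To give a flavor of how a couple of these fit together, here is a minimal sketch using NumPy and pandas; the data frame and its values are made up purely for illustration:

```python
import numpy as np
import pandas as pd

# A tiny, made-up table of exam scores (illustrative data only)
df = pd.DataFrame({
    "student": ["a", "b", "c", "d"],
    "score": [88, 92, 79, 95],
})

# pandas handles the tabular bookkeeping; NumPy supplies the numerics
mean_score = np.mean(df["score"])
print(mean_score)  # 88.5
```

The IPython Notebook is the natural place to run snippets like this interactively while exploring a dataset.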
The R programming language is used by many academics and data science practitioners for exploratory data analysis. R has enormous overhead and can be very slow, especially for memory-intensive work, and so is typically not used as a production environment. However, there are hundreds of useful libraries, built on one of the oldest statistical platforms, that implement machine learning algorithms, data visualization graphics and data manipulation. As such, it is one of the best known languages for data science. A small sample of the R libraries that a DS might find useful is below. Before you start using R, download RStudio, an IDE of the best kind: free, fast, full of features and fantastic.
- Machine Learning and Natural Language Processing: arm, caret, caretEnsemble, caTools, chron, cvAUC, cvTools, doParallel, dynamicTreeCut, e1071, flexclust, gbm, glmnet, kernlab, lpSolveAPI, mice, neuralnet, randomForest, rattle, ROCR, rpart, RWeka, SnowballC, tm
- Statistics and Time Series: digest, aod, lsr, methods, multilevel, psych, zoo, quantmod, Quandl, QuantPsyc, sm, stats, UsingR
- Data formatting, manipulation and storage: RCurl, RODBC, RMySQL, RPostgreSQL, RSQLite, sqldf, xlsx, lubridate, dplyr, plyr, base64enc, data.table, downloader, jsonlite, manipulate, methods, multilevel, reshape2, XLConnect, foreign, XML
- Plotting and Visualization: ggmap, ggplot2, googleVis, gclus, jpeg, lattice, maps, pROC, RColorBrewer, rgl, rpart.plot, shiny, shinyapps, vcd
Industry makes extensive use of matlab; it is not an inexpensive piece of software, and add-ons are additional expenses. However, many engineering students were taught matlab, and everything that you can do for free in python and R, you can do in matlab for a price. The open source version of matlab is Octave, which is not as widely supported or used as the other programming tools in DS. That said, matlab is a wonderful vectorized programming language that handles much of linear algebra in a natural and uncluttered way. Andrew Ng’s Machine Learning course made extensive use of matlab (or Octave) and is probably the best advertisement for this language for the data scientist.
Real programmers use C/C++ and real (old) scientists use Fortran, but this is a post about data scientists, and good luck getting all of the useful modules, libraries, packages, data visualization tools, example algorithms, sample code and the large communities in all three camps converted to C. And for what? A large increase in speed.
Granted, C can provide orders-of-magnitude speedups in execution over interpreted R and matlab code, and perhaps an order of magnitude over even the best written python.
That’s a pretty compelling reason to use C. R, python and matlab all have C interfaces for those guys who need the speed which includes anyone who programs industrial strength code for actual use. Don’t get me wrong, I learned C from Kernighan and Ritchie but I have no interest in shoe-horning C into three other quirky languages.
Julia to the rescue. Julia is a rather new language that caters to the pythonistas, the matlabbers and the R-cran bloggers with a simple language structure but the speed of C; check out the benchmark pictured below. I’m still in the process of learning Julia, but so far, so good.
Everything you can do in R, matlab or (almost) python, you can do with SAS or SPSS. I learned SAS many years ago while on a company paid educational junket to Chicago where I also learned that some bars in Chicago stay open until 4am and even to 6am which made SAS a blur to me a few hours later. Enough said. If anyone wants to add SPSS information, please let me know.
Industry and industrious fellows have created many tools for big data manipulation, analysis and deployment. This section will be expanded as time permits.
- MapReduce; Hadoop components: HDFS, Cloudera/HortonWorks, MapReduce (MR) programming
- Sqoop: loading data into HDFS
- Flume, Scribe: unstructured data
- SQL with Pig
- DWH with Hive
- Scribe, Chukwa for weblogs
- Mahout
- Zookeeper, Avro
- Storm: Hadoop realtime
- RHadoop, RHIPE, rmr
- Cassandra
- MongoDB, Neo4j
- Weka, Knime, RapidMiner
- Spark
- Nutch, Talend, ScraperWiki
- Web scrapers, Flume, Sqoop
- NLTK
- IBM LanguageWare, IBM ManyEyes
- Tableau
- Informatica, IBM DataStage, Ab Initio, Talend
- Data warehouse: Teradata, Oracle, IBM DB2, Microsoft SQL Server
- Business intelligence and analytics: SAP Business Objects, IBM Cognos, MicroStrategy, SAS, SPSS, R
- Hadoop ecosystem: Apache Flume, Apache Sqoop, Apache Pig, Apache Oozie, Apache Crunch
- Data warehouse (Hadoop): Apache Hadoop/Apache Hive, Apache Spark/Spark SQL
- Business intelligence and analytics (Hadoop): custom dashboards such as Oracle Argus, Razorflow
Classic Examples and Algorithms
- testing vs. training, validation, regularization, cross-validation
- regression examples: linear, logistic; decision and classification trees; the perceptron
- polls and election predictions
- handwriting recognition (zip codes); image processing
- text analytics: Google n-grams, bag of words (corpus), sentiment via tweets, natural language processing
- recommendation systems
- clustering
- learning algorithms: supervised, unsupervised; support vector machines and kernel methods
- neural networks
- optimization: linear, integer, convex
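Several of those ideas, a train/test split, cross-validation and a classification model, fit in a few lines of scikit-learn. A minimal sketch on a stock dataset; the model choice and parameters here are just for illustration:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split

X, y = load_iris(return_X_y=True)

# Hold out a test set so the final evaluation never touches training data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

model = LogisticRegression(max_iter=1000)

# 5-fold cross-validation on the training portion estimates generalization
cv_scores = cross_val_score(model, X_train, y_train, cv=5)
print(cv_scores.mean())

# Fit on the full training set and score once on the held-out data
model.fit(X_train, y_train)
print(model.score(X_test, y_test))
```

The same pattern, split, cross-validate, then evaluate once on held-out data, applies regardless of the model plugged in.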
From Berkeley CS100.1x
The difference between descriptive and inferential statistics.
kNN (k Nearest Neighbors)
Support Vector Machines
Latent Dirichlet Allocation
US National Institute of Standards and Technology primer on Exploratory Data Analysis.
The five-number summary
Introduction to Probability and Statistics
Big Data XSeries
Spark’s mllib library
Here is a good description of the difference between descriptive and inferential statistics.
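The five-number summary mentioned in the list above (minimum, first quartile, median, third quartile, maximum) is a classic descriptive statistic and takes one line with NumPy; the sample values below are arbitrary:

```python
import numpy as np

# A small sample; descriptive statistics summarize exactly this data,
# while inferential statistics would use it to reason about a population
data = np.array([7, 15, 36, 39, 40, 41])

# min, Q1, median, Q3, max, using NumPy's default linear interpolation
summary = np.percentile(data, [0, 25, 50, 75, 100])
print(summary)
```

Note that quartile conventions differ between tools, so the middle three numbers can vary slightly from a hand computation.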
Today, I want to share with you some observations and some questions about how FY2015 property tax bills were determined in Arlington. A video presentation of part of this blog post can be found here or watched below.
Let us start with the overall conclusions. Changes in the 2015 property tax bills in Arlington were driven by four factors accounting for 95% of the $3.6M increase in property taxes collected.
1. Owners of commercial properties and apartment buildings saw an overall decrease of about 1.7% in their tax bills, driven by no change in their 2015 property assessments combined with a 1.7% decrease in Arlington’s tax rate.
2. Condominium owners saw an across-the-board increase of about 5% in the value of their buildings, accounting for 27% of the total tax increase for 2015.
3. 65% of the increase in property taxes came from arbitrary changes in land valuations drawn along precinct boundaries, with east Arlington seeing a 22% increase in land values, nine times the increase seen in and around Jason Heights. This land value increase is not supported by recorded sales.
4. Sales of commercial properties and some large apartment complexes were not considered in the assessment process resulting in more than $30M in decreased tax assessments; most notably at the Mill Street Apartment complex.
5. Arlington’s changes in assessments appear to disproportionately affect residents in east Arlington and do not compare favorably with some surrounding towns.
Commercial, Apartment Building, Condos and Land Values
All of the snapshots and discussion below can be found in the video here or by using this interactive GIS map that lets you seamlessly toggle between commercial properties, condos and residential land values.
Commercial Properties – 1.7% Decrease in Tax Bills
First, let us look at Arlington’s commercial properties. Whenever I use the term ‘commercial’, I’ll also be including the much smaller industrial properties. Commercial properties make up about 4-5% of Arlington’s total tax base, pretty steady over the past decade and down from about 8% 20 years ago. I’ve outlined the commercial properties in Arlington in bright red, while apartment buildings are outlined in green. A few things pop out. First, notice that commercial properties follow the Mass Ave corridor west (left) to east (right), forking in Arlington center along both Mass Ave and Broadway.
Let’s zoom in a bit around Arlington center. I’ve filled in the commercial properties with three different colors. White properties had no change in their assessments. Properties filled in with a reddish/pink color saw an increase in their assessments, while parcels colored light blue saw decreases in their assessed values.
Note the two large parcels filled in with a reddish color on the top and left sides of the map. At the left side of Arlington, near Arlmont, is a small portion of the Belmont Country Club, while on the top, or North side of Arlington are the 45 acres or so of the Winchester Country Club that is in Arlington.
Now, for our first observation. All but a handful of commercial properties, filled in with white, saw their assessed values stay the same for 2015. In fact, of the 400 or so commercial properties in Arlington, only 12 parcels saw an increase in their assessed values: the two golf clubs (4 parcels), a spit of land on the Mystic Lakes owned by the Medford Boat Club, the newest Housing Corp purchase by Downing Square and half a dozen random properties in and around Arlington center.
Almost every commercial property in Arlington saw no change in its assessment. Combined with a 1.7% decrease in the tax rate, this means that commercial property owners saw a decrease in their overall tax bills, reducing the amount of property tax collected by about $70,000.
This brings up some questions. Why did commercial properties see no increase in their market values? Why did the town decrease their tax bills?
Two commercial properties saw a decrease in their assessed values this year. 659-671 Mass Ave is a handsome commercial block in Arlington center, across from the Robbins Library, owned by Charles Blumsack and home to Domino’s Pizza, Thai Moon restaurant and Involution Studios – designer of the Town’s budget visualization. The assessment on this commercial property went down almost 11%.
The second property that saw a decrease was way down in east Arlington, along Sunnyside Avenue. The property is owned by Harry Allen and occupied by the Arlmont Fuel Company which saw a 22% decrease in property assessment.
This brings up some other questions. Why did these properties see a decrease in their assessments? What was the process by which these property owners got their tax bills decreased? Did these property owners go through the regular abatement process?
Apartment Buildings – 1.7% Decrease in Tax Bills
Now let’s look at the apartment buildings in Arlington. I’ve outlined in green all 8+ unit apartment buildings in Arlington and filled in the parcels as before. White means no change in the assessed values, pink an increase and light blue a decrease. Let’s zoom in and take a look. Notice that all but two apartment buildings in Arlington saw no increase in their 2015 assessed values. The two exceptions are a 33% increase in the Arlington 360 complex built on the Symmes Hospital site and a 2.8% increase at the Mill St. apartments on the former Brigham’s site.
This brings up the second observation. 71 of the 73 large apartment buildings in Arlington saw no increase in their assessed values for 2015. Combined with a 1.7% decrease in the tax rate means that apartment building owners saw a decrease in their tax bills; decreasing the amount of property tax collected by Arlington by about $75,000.
This raises several questions. How did Arlington’s assessors decide that there was no increase in the market values of apartment buildings in Arlington? Are apartment building values determined by the income method of assessment? Were rents in Arlington steady throughout the year?
Recap – Decreases in Tax Bills
To recap, this view shows all commercial properties – outlined in red – and apartment buildings – outlined in green – and their change in assessment, which is almost exclusively no change, accounting for a combined decrease of about $150,000, or 4% of the total increase in taxes for Arlington in 2015. We will gray out these properties for the remainder of this post.
In this view, we look at all condominiums in Arlington. The condo properties are outlined in blue. One important thing to consider is that condos do not have a separate land assessment. This will become important in the next segment. As you can see, most condos can be found in east Arlington and south (towards the bottom of the map) of Mass Ave.
The property parcels are filled with pink if the change in assessment was between 4% and 5.2%; white if the change in assessment was less than 4% and light blue if the change was greater than 5.2%.
Zooming into east Arlington shows that the great majority of condos saw assessment changes between 4.5% and 5% regardless of their building characteristics. There are a couple of interesting outliers. One was on Hamilton Road, where the end units saw average increases of 12.3%, the same as 993 Mass Ave. Another interesting change was at Colonial Village in Arlington Heights, which saw a 24% across-the-board increase in its assessment.
Why were most condos increased at the same rate, regardless of location or unit sizes? Why did Colonial Village see a large increase in their building values relative to all other condo complexes in Arlington?
The more than 3,300 condo units in Arlington represent about 22% of all properties. The taxes collected on condo properties increased by about $1M overall. Since all taxes collected increased $3.55M, and commercial/apartment properties got a $150K tax decrease, condos accounted for $1M/$3.7M, or about 27%, of the overall increase in taxes collected, about five percentage points more than expected.
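That 27% figure follows from simple arithmetic on the dollar amounts quoted above; a quick check in Python:

```python
# Figures from the post, in millions of dollars
total_increase = 3.55        # overall increase in taxes collected
commercial_decrease = 0.15   # combined commercial/apartment tax decrease
condo_increase = 1.0         # increase attributable to condos

# Residential properties had to cover the total plus the commercial shortfall
residential_pool = total_increase + commercial_decrease   # about 3.7
condo_share = condo_increase / residential_pool
print(round(condo_share, 2))  # 0.27, i.e. about 27%
```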
Residential Land Values Increase 22%
Now for the lion’s share of the increase in Arlington’s property taxes collected in 2015. Let’s switch gears a little bit and look at residential properties.
First, we look at changes in land values. We grayed out all of the commercial properties and apartment buildings that saw zero increase in their assessments. We also grayed out exempt properties which includes town owned property, schools, churches and other entities that don’t pay property taxes. We broke the land assessment changes into five groups.
Properties colored in blue saw changes in land values of less than 4%, pink 4-8%, green 8-15% and yellow 15-30%. Properties colored in white saw no change, teal a decrease and brown an increase greater than 30%. This color scheme shows all condominium properties in white, since condos do not have a separate land assessment.
We used ranges, but typical values were 8% for green, 5.4% for pink, 2.4% and 3.5% for blue and a whopping 21.6% increase in land assessments for properties colored yellow.
What we see right away is that changes in land assessments are clustered; east Arlington saw a 21.6% change in its land values. Residents of Jason Heights saw a 2.4% change. The Morningside area saw an 8% change and those in between saw a 5.4% change. Adding precinct boundaries allows for a stunning observation. Land value changes conform to precinct boundaries.
The change in land assessments was based on political boundaries.
My only question here is: why? I understand that the Mass Department of Revenue can certify neighborhoods during a triennial reassessment. Were precinct boundaries certified as neighborhoods by the state?
Why does the Board of Assessors believe that land in east Arlington increased in value by nine times as much as land in Jason Heights? Why were four distinct values used in setting land values? What sales data supports these changes?
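Whether land changes really cluster by precinct is easy to check once the parcel data is in a table. A sketch with pandas; the precinct labels and percentage changes below are hypothetical, chosen to mirror the typical values described above:

```python
import pandas as pd

# Hypothetical parcel records (precinct names and values mirror the post,
# not the actual assessor database)
parcels = pd.DataFrame({
    "precinct": ["east", "east", "jason_heights", "morningside", "middle"],
    "land_change_pct": [21.6, 21.7, 2.4, 8.0, 5.4],
})

# Grouping by precinct exposes how sharply the changes cluster
by_precinct = parcels.groupby("precinct")["land_change_pct"].mean()
print(by_precinct)
```

With the real data, a near-zero spread within each precinct group would confirm that land value changes follow political boundaries.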
One note. If you use the interactive map found here and zoom in to the fullest extent, the label will change to show the house number and a large value, in millions of dollars, which is the price per acre for that property (land assessment / lot size), allowing easy comparison of different sized lots.
Interactive GIS maps
Use the map below to explore Arlington’s 2015 Property Assessments. Or go here for a full screen version.
Sales Do Not Support Changes in Assessments
When asked about the changes in land values, one Assessor told me that sales data supported the changes in land values. We could find no evidence to support this claim. Before we present our evidence to the contrary, a little background is necessary.
The FY2015 tax bills are generated from property assessments as of January 1, 2014 based (theoretically) on sales from the calendar year 2013. These sales are submitted to the Mass Department of Revenue, Division of Local Services. Sales are coded for being at an “Arms-length” and included in determining assessments while sales coded for being “non-arms-length” (NAL), such as between a parent and child, are not included in determining assessments since non-arms-length sales are unlikely to be at full and fair market value.
For FY2015, Arlington had 1,039 sales, of which 491 were at arm’s length. There were 243 one-family and two-family sales, 241 condo sales, four commercial property sales and three miscellaneous sales of mixed-use properties. The coding of non-arms-length sales, accounting for more than half of all sales in Arlington, should be the subject of an entire post; see two notable examples in the section below.
The 243 one-family and two-family (SF) sales in FY2015 can be found on this Google map, with a snapshot below.
View Arlington FY2015 Assessments and Sales – SF only in a full screen map
Sales Price Increases Over 2014 Assessment
The default pin category (“>Avg Sales Inc?”) shows whether the sales price increase over the property’s prior (FY2014) assessment is greater than the average percentage change (16%) of all sales in the SF category – red pins are higher than average and blue are lower than average. Other pin options can be found in the drop down menu. Click on any marker at bottom of map to show subcategories. Click on any individual marker to see the sales detail information.
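The pin logic amounts to comparing each sale’s increase over its prior assessment with the average increase. A sketch with pandas; the addresses and dollar figures here are invented for illustration:

```python
import numpy as np
import pandas as pd

# Invented sales records: sale price vs. the prior (FY2014) assessment
sales = pd.DataFrame({
    "address": ["12 Elm St", "34 Oak St", "56 Pine St"],
    "sale_price": [580_000, 450_000, 700_000],
    "fy2014_assessment": [500_000, 420_000, 560_000],
})

# Percent increase of the sale price over the FY2014 assessed value
sales["pct_over_assessment"] = (
    sales["sale_price"] / sales["fy2014_assessment"] - 1) * 100

# Red pin if above the average increase, blue pin otherwise
avg_increase = sales["pct_over_assessment"].mean()
sales["pin"] = np.where(
    sales["pct_over_assessment"] > avg_increase, "red", "blue")
print(sales[["address", "pct_over_assessment", "pin"]])
```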
Higher Than Average Sales Evenly Distributed Throughout Arlington
As the snapshot above shows (click to expand to full view), the volume of sales of one-family and two-family homes was not unusually large in east Arlington compared to the rest of the town. In addition, the sales that were far greater (>30%) than the average increase relative to the FY2014 assessment are evenly distributed throughout Arlington, with east Arlington seeing fewer than expected as a share of all housing stock. Further, we reach the same conclusion, that sales in east Arlington were no different from sales throughout Arlington, by looking at all sales above the average, as seen in this snapshot.
East Arlington Assessments Higher Than Average
Not to lose track of what we are arguing here: the picture on the right (click to expand) shows all 2013 sales of one- and two-family homes in Arlington. Blue markers are properties that saw assessment increases higher than the average assessment increase of other properties that were sold, while red markers are assessment changes lower than the average. Note the stunning fact: all sales in east Arlington saw assessment changes above the average. This should not be surprising, since the analysis of land value changes above also clearly demonstrates this fact.
To be complete, we included all condominium sales in Arlington in this google map. Since condos do not have a separate land valuation the sales data does not have much to tell us.
Observations on Sales
1. Many more SF sales outside of East Arlington
2. Toggle to “>Avg Assess Inc?” on the dropdown; it clearly shows what we know: East Arlington had a higher than average assessment change compared to all sales.
3. Toggle back to “>Avg Sales Inc?” on dropdown and then click the category marker “>30%” at the bottom of map which shows that the highest sales over assessments are scattered throughout Arlington.
4. Condos sales are evenly distributed in “>Avg Sales Inc?” throughout Arlington where condos had sales.
5. Condos in east Arlington have fewer (only 2) red pins in “>Avg Assess Inc?”
Condo sales are included for completeness although condos have no separate land value. I don’t see how condo sales can be used to justify increased land values in East Arlington.
Notable Non-Arms Length Sales
On 12/30/2013, immediately before the FY2015 assessment date for full and fair valuation of real property, 30 Mill Street, the site of the former Brigham’s ice cream plant, sold for more than $50M. The current assessment is less than $30M, a $20M difference, or more than $250,000 of taxes shifted from a large corporation onto homeowners in east Arlington. The sale is coded as a non-arms-length transaction, code “B”, which, according to the DOR Classification Handbook, is:
An intra-corporation sale, e.g. between a corporation and its stockholder, subsidiary, affiliate or another corporation whose stock is in the same ownership
The buyer of the property was US REIF BRIGHAM SQUARE while the seller was SP5 WOOD ALTA MILL STREET LLC. Click on either party to view, among other things, their boards and key employees. These appear to be two completely separate entities.
The other bit of sales legerdemain involves the 250 condos sold by the Wilfert Trust in the Brentwood (60 Pleasant St.) and Old Colony condo complexes.
There are several stories about how the treatment of sales affects the valuation process and the role town officials play in the representation of corporate entities.
Example of Valuation Model Over Fit
Another revealing assessment change is the huge year-on-year swings in the Colonial Village condo complexes. The assessments on these properties saw a 24% increase this year and a 25% decrease a few years ago, while sales of condos in these buildings were almost double the assessed values. These large changes in the Colonial Village condos are symptomatic of a valuation process that overfits the data, especially on properties like condos that don’t include the leading factor, lot size, in the computerized process. This observation deserves its own blog post and involves math that most people would find dull.
Red Flag – Distributions
Looking at the distribution of the percentage change in assessments from 2014 to 2015 for different communities can be enlightening.
In the table below are four such distributions showing the change for residential properties. Click to expand the image. The four images represent the towns of Wakefield, Westford, Lexington and Arlington. The horizontal (x) axis shows the percent change in assessed value, while the vertical (y) axis shows the number of properties (parcels) with that percent change. The vertical line in the center of each chart shows the average residential change. The dark blue columns are condominiums.
Some things to note.
The chart for Wakefield is very easy to understand. The average assessment change was 2.63% with most properties seeing between a 2% and a 4% change in their assessments with a comparable number seeing no change in their assessments. Wakefield’s assessment changes are incremental, uniform and tightly clustered about the average change.
Next, look at the chart for Westford. Westford is undergoing rapid development, with new construction ongoing that far exceeds Arlington’s. With a 4.73% increase in assessed values for residential properties, the distribution is somewhat skewed, with a long tail representing growth in new homes and improvements. Condominiums, shown as blue columns, are somewhat uneven, with a larger fraction of condos seeing a smaller than average increase in their assessments.
Lexington performed its triennial reassessment in 2015 with all properties revalued. The average increase in residential assessments was 10.33%, which combined with a 4.6% decrease in the tax rate meant the average residential tax bill increased about 5.7%. One thing to note about Lexington is the symmetric distribution about the average – a “normal” distribution. This is the sign of a robust revaluation process that treats all properties, including condos, equally.
Finally, we show Arlington’s distribution of percent changes in residential assessments. The average is about 5.7%, which, combined with a 1.7% decrease in the tax rate and zero change in commercial and apartment properties, resulted in tax bill increases of about 3.8%. The first thing to note is the disparity between condos and single/multi-family dwellings. Most condos saw a below average assessment change (4.8% vs. 5.7%), in agreement with our observations above. But the real difference is the bump, or extra peak, in the distribution around the 10% change. These are the east Arlington residential properties that saw such an outsized increase in their land values for 2015.
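A bimodal shape like Arlington’s is easy to reproduce with a quick histogram. The sketch below uses simulated data (a main cluster near 5.7% plus a smaller bump near 10%), not Arlington’s actual parcel records:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen, no display needed
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)

# Simulated assessment changes: a town-wide cluster plus a hypothetical
# second peak standing in for the east Arlington land-value bump
changes = np.concatenate([
    rng.normal(5.7, 1.0, 4000),   # main residential cluster
    rng.normal(10.0, 0.8, 800),   # second peak near 10%
])

fig, ax = plt.subplots()
ax.hist(changes, bins=60)
ax.axvline(changes.mean(), color="k", linestyle="--", label="average change")
ax.set_xlabel("percent change in assessed value")
ax.set_ylabel("number of parcels")
ax.legend()
fig.savefig("assessment_changes.png")
```

A single, symmetric peak about the average, as in Lexington’s chart, is what a uniform revaluation should look like; the second peak is the red flag.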
When I was a graduate student in Physics at Boston University during the 1990s, I had the opportunity to work in a research group with about 20 other students. For a time, I was the only American in the group. Many of the students were from Eastern Europe; Bulgaria, Slovakia, Hungary and Russia. As well, most of the students in this graduate program at Boston University were from India, China and South America. At that time, Boston University had the second largest foreign student population of any US university and the graduate program in physics is truly an international group at most US universities.
On one month-long trip to China, my travel mates were from Russia, Iran, Portugal, Argentina and South Korea. We were like a little United Nations with the attendant problems crossing borders; the Russian was hassled by the US embassy trying to obtain his re-entry Visa, the Argentine was not allowed to enter Macau, not by the Portuguese, but for re-entry to Hong Kong by the British – a hangover of the Falklands, the Iranian was welcomed by the Chinese, while my passport was met with scrutiny by the cigarette smoking customs guard with the machine gun who welcomed me to a free country on our entry to Xiamen.
To this day, I count people from Iceland, Venezuela, Israel, Japan and from all over the world as my friends.
Generally, during the good weather, the graduate students would climb out of our cold, subterranean laboratories to share lunch together on the plaza in front of the Science Building at 590 Commonwealth Avenue. Even at that time, I couldn’t help but recognize, and somewhat cringe, at the sight 20 or 30 young men (and some women) would make, dressed alike in jeans and flannel shirts, even on warm spring days, while the fashionably dressed young undergraduates would walk by.
Although we represented the entire gamut of ethnic, religious and national identities, what differentiated us most, in our minds, was whether we studied theoretical or experimental physics. What I realized two decades ago, whether I was sharing tea on a beach in China at 2:00am, scaling a 20 foot high wall after the city gates had closed, packed with seven other people into a Trabant driven by a crazy Russian or making a “pilgrimage” with an Indian friend to the local Walden Pond is that I had a stronger bond with these other physicists than I did with the people I had grown up with in lily white, mostly Irish Catholic Arlington. Our bond was not how different we were in appearance, background, economic experience or beliefs, but in our mutual pursuit of science.
That is the lesson I learned. Diversity is not what separates us, diversity is what brings us together.
Greetings! This is a follow-up post to my experience with online courses (see previous posts Spring 2013, MITx 6.00x, MITx 8.MReV and Fall 2013), also known as MOOCs. Since September, I have completed seven online classes and audited another five bringing my total to 15 completed courses and 11 audits over the past year. I decided to overload the number of courses in the fall to find out the limits of what one, somewhat average, middle aged guy could do with online learning.
I sampled a number of courses, but focused on the best offerings from some of the most prestigious universities, not worrying about dropping out of a class, since I am not looking for anything more than knowledge. As well, I’ve reached some conclusions about what knowledge acquisition might be good for besides a ‘certificate’, ‘diploma’ or other intangible accolade.
Below, I describe in gory detail each course completed and briefly describe the courses I “dropped” giving pathetic reasons why. Before I do this, I make some general comments about the two most popular platforms for MOOCs, edX and Coursera. I also describe my overall experience in taking what amounts to more than a full load of college level courses that parallel some of the most popular introductory courses at some of the best universities in the United States and from one in India.
Forgive me for the length of this post. I hope those that read my drivel might attain some understanding of these MOOCs. Mostly, though, I write these thoughts down for myself to record my observations.
Since late August 2013, I have registered for more than a dozen courses on the edX and Coursera MOOC platforms. At this point, I have completed or will complete seven of the courses after doing all of the work associated with each.
In addition, I audited and/or partially completed the following courses.
As detailed below, the courses took about 30 hours/week of work. First, let me respond to all of those out there who complain about how busy they are. I completed all of this work while also running a business, spending as much time as possible with my family and collaborating with others on two new business ventures. This is in addition to the normal social and professional engagements required of an adult. Several times during the fall, someone or other would complain about how busy they were. Some people put in their time; I like to use my time effectively.
This brings me to my second point. Seven college level classes, plus audits in a half dozen others, far exceed the normal load of a college student. Part of the ability to do all of this work, and more, lies in the fact that I am much older than your typical college-aged kid, with a broader and deeper level of education than any undergraduate; some of it comes from my own formal education, but even more from having learned far more than a mere 10 years of post-graduate education might provide. One conclusion I have reached is that, to paraphrase George Bernard Shaw, education is wasted on the young. I see the potential of MOOCs, in addition to educating university students, in retraining technology workers, keeping retirees mentally active and opening up new areas of knowledge for the non-traditional student.
More to the point, the unique structure of the MOOC made this course load possible. In a residential (normal) college setting, lectures are at a set time and there is always dead time between classes. Not so with a MOOC. During the term, I could be listening to a lecture and pause it while taking a client phone call. I often listened to lectures while waiting at various sporting venues while my children practiced. I could read materials on the subway directly from my phone. I often completed assignments, participated in discussions or reviewed online notes very early in the morning or late at night. All this is to say that MOOCs are an efficient education delivery mechanism, allowing for incredible productivity gains by the student as well as the teacher.
One last point is the difference between the MOOC platforms. Coursera has a larger selection of courses from many more institutions than edX. However, the courses themselves tend to be easier, the presentation less rigorous and the use of technology (embedded autograders, simulations and interactive problem sets) less impressive than on the edX platform. Coursera’s format is somewhat confusing, with too many clicks to go between courseware components, while edX’s LMS is crisp and relatively clean, with most features just a single level deep. Finally, of all of the courses offered, I have to say, in my still limited experience, that MITx has far and away the best implementations, with MOOCs that compare favorably with the residential courses offered on campus.
Now for an overview of some of the courses.
Princeton Statistics One
Princeton’s Statistics One (“Stats1”) course, taught by Andrew Conway, is hosted on the Coursera platform. Stats1 is a series of 25 lectures, each broken into two segments of 10-25 minutes, recorded in HD. Andrew stands next to a large-screen monitor, holding a tablet strapped to his hand to control the PowerPoint slides, which he annotates during the lecture. Professor Conway talks directly into the camera; there is no class present. Princeton does not offer a certificate for this course.
The class covers four broad categories of introductory statistics:
- Research Methods and Descriptive Statistics
- Simple and Multiple Regression
- Group Comparisons using T-tests and ANOVA
- Non-normal Distributions and Non-linear Models
There were 10 lab tutorials showing how to use the R programming language to do statistical analysis. There were 11 homework assignments that reinforced the lecture concepts, as well as two exams, a midterm and a final. I spent approximately 2 hours a week on this class and scored a 90% overall.
The course used examples from IQ testing/memory training studies, concussion studies using the IMPACT dataset that many high schools now use, and other real world datasets. There were a couple of contrived examples, which I would urge Conway to replace with more meaningful data. The concussion studies were particularly interesting, showing a link between pre- and post-test results for athletes suffering head injuries.
I’ve never taken a dedicated statistics course, and although little of the material was new to me, it was informative to have a clear, concise and detailed exposition of the subject delivered in a comprehensive manner. Overall, if you are interested in understanding statistics, or want to solidify your R statistical programming skills, I can recommend taking Conway’s Statistics One course.
MITx – 7.00x Introduction to Biology
MITx 7.00x is an introduction to biology required of all MIT students. The professor is Eric Lander, a well-known, accomplished scientist and professor. I found 7.00x well designed, with an excellent sequence of mind-expanding lectures. The questions throughout the course kept me on track, helped solidify my understanding of the lecture materials and pushed me to learn certain aspects of biology on my own.
The use of the tools in 7.00x, like the molecular editor, jsMol (a 3-D macromolecule viewer), geneX, IGV (an interactive gene viewer) and the virtual genomics lab VGL, was perfectly coordinated with the lectures. I’m one of the lucky participants with high-speed internet, a monster computing device and multiple screens. Having multiple screens really came in handy when answering the problem sets: opening resource-box PNGs, the instructions and the answer section all on different screens. I’m sure the clever edX developers will figure out a way to make this less important in the future.
The week 7 lectures and problem sets were just brilliant. I showed my middle school children the connection from DNA to RNA to protein using the visual representation made possible by the jsMol tool in the problem sets, which allowed them to follow the conclusions presented in the lecture materials. Their response was ‘cool’. This is a testament to the efficacy of the tools used in 7.00x: portions of the course are entirely accessible to almost anyone.
In the lectures, the discussion of how sickle cell anemia changes the morphology of the red blood cell, and how the molecular mechanism of hydrophobic binding forms long chains, all resulting from a single base pair change in one chromosome, was the high point of 7.00x for me. Adding in the discussion of all the Greek-lettered (beta, delta, gamma, etc.) globins, the connection to thalassemia, fetal oxygenation processes and the introduction to evolutionary genetics was just pure brilliance.
The last few lectures covered the molecular biology of both heart disease and certain cancers employing all of the genetics, biochemistry and other materials presented throughout the course. Again, the discussion was clear and my understanding of rational medicine improved by many orders of magnitude in 14 short weeks.
7.00x consisted of 27 hours of lectures and help sessions (although at 1.5X speed, it is more like 18 hours). There were 176 or so interstitial quizzes requiring about 3 hours of work (@ 1 minute each), 15 hours of reading (I admit, I skimped here!), 7 problem sets at 3-5 hours each for 30 hours and 3 exams at 4 hours each for 12 hours.
There were approximately 530 questions (counting green checks and the dreaded red X’s) in the problem sets and approximately 157 questions in the three exams. Those who completed all of the work in 7.00x answered well over 850 biology questions. In the beginning, I dreaded the single try exam questions, but in the end appreciated the necessity to up my game and not just answer with my usual, lazy, first guess.
7.00x consumed about 100 hours of my time, which over a 14 week period, was about 7 hours per week. An investment that was well worth the effort. I’ve estimated about 10 hours participating/reading the discussion groups, which may be on the low side. I also left out the huge number of tangents, self guided studies prompted by other students’ observations and discussions with my friends and family about the material I was excited about and they patiently listened to. I garnered an 89% in this course.
In conclusion, 7.00x is a rigorous, well designed, fantastic, interesting, mind blowing introduction to biology. I want to thank Professor Lander, all of the TAs and course staff, MIT, edX and the Broad Institute for providing me with this experience. For me, learning is the true secret to a well lived life and Eric Lander’s 7.00x should not be a secret for anyone who wishes to learn.
Some notable 7.00x links
A virus simulation highlighted by Professor Lander:
MITx – 8.01x Classical Mechanics
As anyone who has taken a course in physics knows, by far the most difficult course offered in either an online or a residential setting is a course in physics. While I might have spent a few minutes, or even half an hour, on a particular biology problem, there were times that I struggled for *days* to solve a single problem in 8.01x. The instructor, Walter Lewin, is a great lecturer, combining an entertaining pedagogical style with unforgettable in-classroom experiments and a genuine passion for teaching physics. 8.01x is a rigorous and challenging study of introductory physics, and the lectures and problem sets should replace every high school AP physics lecture course.
Below is the introductory video for another class by Walter Lewin (8.02X – Electricity and Magnetism). However, I think it is also a great introduction to this course.
I kept (rough) track of the time required for each component of 8.01x. There were 36 hours of lectures and help sessions; although at 1.5X speed, it is more like 24 hours. As an aside, I can no longer watch lecture videos at normal speed, nor tolerate lecturers who lack the stage presence to view well at 1.5X speed, as Walter Lewin does and as Eric Lander, who teaches 7.00x, does. There are 253 or so interstitial quizzes for about 4 hours (@ 1 minute each), 10 hours of textbook reading (I admit, I skimped here!), 10 problem sets at 3-5 hours each for 40 hours and 4 exams at 5 hours each (assuming a tough final) for 20 hours. 8.01x will consume about 100 hours of my time, which over a 15 week period, is about 6-7 hours per week. I achieved a grade of 96% in 8.01x.
My estimates may be on the low side relative to the general student population of 8.01x, and my own accounting may be off.
In this accounting of time spent, I’m completely ignoring time spent in the discussion group. I was at the low end of usage with 40 or so comments, while others posted almost 3,000 extremely useful comments. As well, I am not including the “extras” that many students do, such as the 300+ pages of LaTeX-formatted notes compiled by one very diligent TA or the research into tricky homework problems. When one considers that the average tutor charges about $65/hr for help in AP physics, I’d estimate that community TAs provided hundreds of thousands of dollars worth of free tutoring.
So what is essential to the student of physics? You may say all of this and more (labs, in class discussions, one on one office hours, peer study groups, etc.), and you would be correct, but as a minimum, the lectures and quizzes are, as Professor Lewin might say, non-negotiable. More to the point, the lectures are so well done, with in-class experiments, thoroughly entertaining, concise yet rigorous, that I believe this set of 8.01x lectures represents, in some very real sense, a platonic ideal of what an introductory course in classical mechanics should be. Moreover, the lectures themselves are accessible to a wide range of students, including high school students taking AP physics. At 3-4 hours a week, one could imagine a future in which all lectures in classical mechanics are an evolved or derived form of 8.01x. Furthermore, because these lectures are so accessible, a bright future might include a wider audience learning the basics of elementary physics with a relatively modest time commitment.
But the larger time and commitment sink, which solidifies a large fraction of one’s understanding of the material, is the problem sets and the assessment/evaluation process (exams). From my own experience, the commitment required to complete the problem sets is greater than the commitment to watch the lectures and try the quizzes.
One might imagine a hierarchy of problem sets. At the lowest level, the casual student might be asked to simply regurgitate the lessons learned in lectures with simple modifications: various iterations of Atwood machines or capstans. These assignments might allow for infinite guesses until the student displayed mastery (much like the current system). At this level, and also experienced in this course, small variations on worked examples in the text would also be appropriate.
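For concreteness, both of those regurgitation-level systems have textbook closed forms, which is what makes them good first-rung exercises; a quick sketch of my own (not course code), using the standard ideal-Atwood and capstan-equation results:

```python
import math

def atwood_acceleration(m1, m2, g=9.8):
    # Ideal Atwood machine (massless rope and pulley): the net weight
    # difference g*(m1 - m2) accelerates the total mass (m1 + m2).
    return g * (m1 - m2) / (m1 + m2)

def capstan_holding_tension(t_load, mu, theta):
    # Capstan equation: friction over a wrap angle theta (radians) lets a
    # small holding tension t_load * exp(-mu * theta) balance the load.
    return t_load * math.exp(-mu * theta)

# A 3 kg vs 1 kg Atwood machine accelerates at g/2 = 4.9 m/s^2;
# three turns of rope (mu = 0.3) hold a 1000 N load with only a few newtons.
print(atwood_acceleration(3.0, 1.0))
print(capstan_holding_tension(1000.0, 0.3, 3 * 2 * math.pi))
```

Variations on these two formulas (change the masses, add an incline, change the wrap angle) are exactly the kind of infinite-guess drill described above.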
Imagine a society with a larger fraction of advanced high school students and first year college students displaying a mastery of just the lecture materials. IMHO, this would be beneficial.
At the next level, more difficult twists on the worked examples, perhaps with fewer allowed attempts, would allow an assessment of students able to go beyond and generalize the presented materials. Ratcheting up the assessments to the level seen in the current 8.01x, with related or new material in problem sets and exams, and limited attempts allowed on exams, makes for differentiated assessments. Finally, the most excellent problems, such as the leaning ruler, intended to provide the ‘ah ha’ moments that signal an increase in intuition, a firm grasp of the concepts and the teaching of the “elegant” solution, with perhaps a single try allowed, represent the highest level of traditional instruction.
That is what I love about the MITx courses: the limited-try, classic problems that you know have an elegant solution, and that ‘ah ha’ moment of solving them. This is how budding scientists are identified and trained today, at least in the classroom setting.
Now imagine a MOOC with dozens or even hundreds of examples and variations on each of these problems found in the homework sets and exams that allow for the student to be led to increasingly difficult assessments for each concept covered in the lectures. Imagine a tailored courseware that leads the student to push their own ability. I believe this will be how you can produce budding scientists.
All of this is to say, the current edX implementation of 8.01x is surely evolving. What we see now is just one small slice of a continuum of assessment tools (i.e. problem sets/exams) with a rather elementary lever, the number of allowed attempts, for setting a reasonable threshold as we gradually explore and refine the education system of the future. I’m happy that the course designers are tinkering with that lever, hoping that their vision of edX leads future students to better outcomes.
I read an interesting comment about MOOCs and education in general in a news article recently that is germane, although you may find my analogy long and torturous to follow.
One reader described the ‘military’ model of training, where (after the boot camp winnowing process) no one is allowed to fail certain tests: enlistees are required to continue a course of study, and are continually tested, until they pass, usually with a very high proficiency rating.
What fascinates me about this insight is that this is the ideal elementary and secondary school education model, where no student is allowed to progress until showing proficiency in a subject. Unfortunately, the labor-intensive nature of the current education model makes this ideal almost impossible to achieve in your average public school here in the US.
Stanford – Machine Learning
One of the first modern MOOCs; Stanford Professor Andrew Ng does a great job with a difficult subject. Ng is a cofounder of Coursera, and his Machine Learning course was being offered for the fourth time. There were a couple of innovations for this course. Previous students offered tutoring services, with prices ranging from *free* to $30 for half an hour, through Google Helpouts, a video chat service. This is an interesting experiment towards producing some kind of revenue stream for a MOOC.
Andrew Ng’s Machine Learning is a well designed course covering many of the commonly used algorithms in, well, machine learning. The course consisted of 18 lectures over ten weeks, each broken into 5-10 segments of about 10 minutes, for a total of just under 20 hours of lectures, or 16 hours at 1.25X speed. Between segments there were multiple choice questions covering the material just learned, for a total of 112 questions (2 hours @ 1 minute each). There were 90 review questions (3 hours @ 2 minutes each). As well, there were 9 programming assignments, each taking 2-3 hours, for a total of about 25 hours. The total time commitment for the course was 46 hours, or about 5 hours per week over the 10 week course.
I did not participate in the online discussion groups, maybe reading a total of 10 posts (looking for resolution to an octave installation issue) and posting nothing myself. I did no outside reading, nor did I follow many tangents from the class. I achieved a 100% overall for the course.
The course covered a number of algorithms in machine learning, starting with linear and logistic regression, minimizing cost functions using gradient descent methods. Next we covered neural networks, training a network to identify handwritten digits, a classic problem in recognizing zip codes for post office sorting. This assignment was pretty cool, using real data. The algorithm was about 99% accurate, and the handwriting samples that it failed to classify were difficult even for the human operator, see below.
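The gradient descent recipe from the early lectures is simple enough to sketch in a few lines. Below is my own minimal plain-Python illustration (not the course’s Octave code): batch gradient descent fitting a one-variable linear regression by repeatedly stepping against the gradient of the squared-error cost.

```python
def gradient_descent(xs, ys, alpha=0.1, iters=5000):
    # Minimize J(b, w) = 1/(2m) * sum((b + w*x - y)^2) by batch gradient descent.
    m = len(xs)
    b = w = 0.0
    for _ in range(iters):
        errs = [b + w * x - y for x, y in zip(xs, ys)]
        grad_b = sum(errs) / m                          # dJ/db
        grad_w = sum(e * x for e, x in zip(errs, xs)) / m  # dJ/dw
        b -= alpha * grad_b                             # step downhill
        w -= alpha * grad_w
    return b, w

# Recover y = 2x + 1 from noiseless samples on [0, 1].
xs = [i / 49 for i in range(50)]
ys = [2 * x + 1 for x in xs]
b, w = gradient_descent(xs, ys, alpha=0.5, iters=5000)
```

The learning rate `alpha` and iteration count are hand-picked for this toy data; the course spends considerable time on choosing them well.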
After neural networks, we covered algorithms for support vector machines (SVMs), clustering using K-means and principal components analysis (PCA). Examples and problem sets covered topics in autonomous driving, image compression, email/spam processing and machine vision. The course wrapped up with anomaly detection systems, recommender (Amazon, Netflix) algorithms and large scale datasets.
I actually registered for three similar courses: Caltech’s Machine Learning, and Big Data and Web Intelligence from the Indian Institute of Technology. The Caltech one was over my head, while the IIT one was too easy, like a survey course. The Stanford course was the Goldilocks choice, just right. I stopped doing the Caltech course, but completed the IIT one. Perhaps now, I’ll revisit the Caltech offering armed with a greater understanding of the material.
I think that is one of the huge advantages of online learning: multiple options for lectures, assignments and resources. Some of my MOOC buddies loved the Caltech course, so there is no accounting for taste, but there are enormous advantages to the smorgasbord approach to mastering a subject. During the course, I bookmarked a dozen different ML courses, from MIT’s OCW archive to complete programs at other universities.
I took Stanford’s ML course and the other MOOCs with an eye towards the barriers to learning a young student might face. In particular, the three most difficult hurdles I recognized in the ML course were mathematical formalism, linear algebra and vectorized programming.
Andrew Ng, the professor, frequently acknowledged the potential difficulties of the formalism throughout the course, but to his credit he maintained a good balance between rigor, mathematical sophistication and accessibility. I observed a similar, stated difficulty when I took the Berkeley computer graphics course. For a physicist used to four-vectors, alternative coordinate frames and Einstein indices/Kronecker deltas, the formalism is trivial, but it is easily daunting to those less familiar. Ng’s course was well done, with excellent consistency in using his indices for summations – ‘i’ and ‘m’ for training sets, ‘j’ and ‘n’ for features. That said, a few auxiliary slides with visual representations of the components of the resulting cost functions would be a nice addition.
The second hurdle is linear algebra. I might be mistaken, but most students are not exposed to LA until after the first year of college. I think linear algebra should be a foundational subject for budding computer science students. There is no reason that a formal course needs to be delayed so long in the typical sequence; Khan Academy has excellent tutorials and Ng had an introductory lecture on the necessary material. I think we focus on getting programming courses into high schools when a well done, highly motivational, and blessedly short intro to linear algebra with specific examples used in computer science might be more useful to students.
Which brings me to vectorized coding. One minor satisfaction in the ML course was taking one of the complicated, double or triple summation cost functions and implementing it as one compact line of Octave code. Introducing loops is typical for intro computer science classes; it would be beneficial if students were then shown loop avoidance and other vectorized programming techniques early on. I was unhappy at having to succumb and use Octave (MATLAB) for the course. However, after completing ML, I can appreciate Ng’s insistence on using Octave. The complicated optimization formula, with its double summations, can be reduced to one line of Octave code:
J = 1/2*sum(sum((((X*Theta') .* R)-(Y .* R)) .^ 2)) +
(lambda/2) * (sum(sum((Theta .^ 2))) + sum(sum((X .^ 2))));
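For comparison, here is my rough NumPy translation of the same regularized cost (the names X, Theta, Y, R and lambda follow the course’s collaborative-filtering convention; this is my own sketch, not course material). Since R is a 0/1 indicator matrix, `((X*Theta').*R - Y.*R)` is the same as masking the error term with R:

```python
import numpy as np

def cofi_cost(X, Theta, Y, R, lam):
    # Squared error counted only over rated entries (R == 1),
    # plus L2 regularization penalties on both factor matrices.
    err = (X @ Theta.T - Y) * R
    return 0.5 * np.sum(err ** 2) + (lam / 2) * (np.sum(Theta ** 2) + np.sum(X ** 2))
```

The point carries over unchanged: a nested double summation in the notes becomes a couple of whole-array operations, with no explicit loops.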
Again, I see no reason against, and plenty of benefits from, introducing vectorized coding very early in a computer science curriculum. Back in the day, I did a lot of scientific programming on the early models of the Connection Machine, and there was a certain elegance and intellectual satisfaction in writing code that reduced to a single line directly recognizable as the textbook formula.
Sorry for the long note, but it boils down to two points: multiple, parallel course materials are useful and have only incremental cost in a MOOC model, and three foundational mathematical courselets (index notation, linear algebra and vectorized programming) deserve earlier introduction, in high school, in a computer science curriculum.
Johns Hopkins – Data Analysis
Very intensive and short; a great way to learn R. Highly recommended. 4 weeks long, 4 challenging problem sets. Excellent coverage of R graphics packages and the visual display of quantitative data. I achieved a 99% in this class.
HarvardX – MCB80.1x Neuroscience
IIT Delhi – Web Intelligence and Big Data
Survey course, great intro to all sorts of subjects: MapReduce, Hadoop, NoSQL, on and on. I lost interest about 2/3 of the way through, and with the start-up of Stanford’s class, I did not watch the last third of the course lectures, although I did moderately well, all things considered, on the quizzes, homeworks and final exam, earning a 71% in the class and a certificate suitable for framing ;->
CalTechX – CS1156x Learning from Data (Machine Learning)
Too hard for me, switched to Stanford’s ML course instead.
Rice – Interactive Python Programming
Nothing I can’t learn on my own. Did not appreciate the twice weekly assignment due dates.
BerkeleyX – CS-191x Quantum Mechanics and Quantum Computation
Excellent course, challenging, rigorous. Decided to drop it and re-register at a later date.
MITx – 2.03x Dynamics
I had to drop this course since I was pretty much loaded up by the time it started.
I got a lot of feedback from the last few posts describing my experience with some of the MOOCs. Today, I do the worst possible thing and discuss my plans for fall courses. At this point, I have completed seven courses; three from edX, three from Coursera and one from Udacity while ‘auditing’ another four. This fall, I plan on taking/auditing a bunch more.
MITx 7.00x is an Introduction to Biology. The lead instructor is Eric Lander, a well known, accomplished scientist and professor. I am looking forward to this course since I know the incredible strides microbiology, biochemistry and genetics research have made in the past few decades.
The last time I took a biology class was almost 40 years ago as a high school freshman. I vividly recall the first day of class when the teacher asked us students to write down one question, any question, no matter how stupid or silly, that related to biology. Having been a regular Fidelity House Day Camp tripper, I asked whether the practice of camp counselors cutting open frogs harmed the frogs, the counselors’ assertions to the contrary notwithstanding. My high school biology teacher was shocked and proceeded to vilify my apparent, incurable stupidity for the rest of the class.
While this experience was only a part of why I was turned off to biology, I was able to hold the subject with a disdain that was only later reserved for other pseudo-science endeavors such as economics, sociology and psychology. Over the years, I did come into contact with the biological sciences, but managed to hold my prejudice intact. I was awarded several NIH grants to continue my graduate school research. I did a short gig providing a small hand in setting up a computational bioinformatics lab for a first wave (1980s) biotech company. I was an early user of the PDB, finding convincing images of the floppy hinge of mutated Apolipoprotein E4 as a possible physical mechanism for plaque formation in Alzheimer’s patients, and later presented a seminal Ether Dome talk on the subject.
This is all to say that biology, in the 20th century, was not considered a hard science by the snobby intellectual. Time to shed that prejudice! Although 7.00x does not begin until next week, the reading list was distributed and all indications are that biology has come a long way in 40 years. The first reading assignment covers the chemical components of a cell, beginning with atoms and building up, block by block, using firmly established chemical and physical principles, the macromolecules responsible for molecular biology.
I suspect that 7.00x may do more than remove my long-held prejudice occasioned by, what I might claim as, rough handling in my formative years. I expect that 7.00x will kick my intellectual ass.
MITx 8.01x is the ‘classic’ Classical Mechanics course taught to MIT freshmen. I have probably ‘cheated’ a bit, first by having already seen all of the course material in my formal education, but also by participating in the 8.MReV review course this summer. That said, I am looking forward to listening to the lectures by Walter Lewin. I explain this by analogy. In the 1960s, one of the greatest physicists of the 20th century, Richard Feynman, taught the introductory freshman physics course at Caltech. Lore has it that the students were utterly and hopelessly lost and many stopped attending, only to be replaced by an ever increasing number of graduate students, physicists and the merely curious. The end result was the compilation of the Feynman Lectures on Physics, an orthogonal, entertaining and unique resource for understanding physics. Cool link here http://www.feynmanlectures.info/
This Coursera class is interesting but perhaps not very rigorous. I’ll report more on this later.
I want to see how Princeton implements an online class. Plus, everyone loves a good story!
Hello again! Today, I will relate my experience in taking my seventh MOOC and fourth on the edX platform. First, a little background. I had no intention of taking this course since I have never, ever taken a course during the summer which I generally reserve for necessary business and vacation time with my family. I’ll admit, this course came in third, or maybe even fourth place in terms of priorities. That said, it was a worthwhile class and I highly recommend it.
8.MReV is a college-level introductory mechanics class using a strategic problem-solving approach and is designed for teachers of college or Advanced Placement physics. The students ranged in age from 14 to 80 and I heard estimates of about 10,000 participants. I am not sure how many completed the course. In addition to teachers of physics, there were many current students and lifelong learners, judging by the introductions in the discussion forum.
The course ran from about mid-June through the end of August with some optional material that runs through the middle of September, offering a condensed schedule covering a semester’s worth of material. Over the graded, 11-week course, there were approximately 40 video lectures, a dozen or so interactive Java applets, about 100 pages of notes and almost 1000 graded problems on various topics in classical mechanics. The course followed the SIMs (system – interactions – model) framework developed by Dr. David Pritchard for a uniform approach to problem solving. The course did not attempt to teach the concepts presented, assuming some level of familiarity already; instead, it was designed to teach proficiency in problem solving.
My decision to take this class was made at the last minute; unfortunately I was away at the start of the course and twice during the summer. While I do have a formal background in physics, I had not studied this material since my high school AP course more than 35 years ago. I did register for the usual sequenced classical mechanics course MITx 8.01x which starts in September and rationalized that any work I put into the 8.MReV class would be helpful. I was in for a surprise.
The coursework consisted of about 400 graded checkpoints scattered as exercises throughout the lectures and notes. Generally, these were true/false questions, multiple choice and a fair number of formulaic solutions. Unlike many other courses, the number of allowed guesses was limited: one for true/false (doh), two or three chances for multiple choice questions and between two and ten guesses for questions requiring a formula answer. Answers to the exercises were not provided until after the due date, which was sub-optimal for learning.
In addition to the 400 checkpoint exercises, there were approximately 300 homework problems; many of these problems were very hard. In my opinion, this is where the course really shone. Some of the best problems in classical mechanics were contained in the homework sets. I had forgotten the beauty of finding an elegant solution to a challenging problem, more about this below. There were also 8 quizzes with an additional 160 problems. Generally, the quizzes were much easier than the homework sets, although you were given far fewer tries to get the correct answer. All together there were about 1000 problems in this course, about 100 due every Sunday. I spent an average of 6-8 hours per week on this course and certainly not enough time on the homework sets.
The material covered Newton’s Laws, equations of motion, kinematics, mechanical energy and work, dynamics, linear momentum, impulse and inertia, torque and rotation, rotational energy, angular momentum, orbital mechanics and harmonic oscillations. Problem sets included examples using systems of blocks and pulleys, Atwood machines, pendulums, springs, inclined planes, friction, projectiles, collisions, ladders, rolling, slipping, skidding balls, yo-yos, tether balls, merry-go-rounds and many other simple systems.
One of the highlights for me was the lecture by Walter Lewin who showed that rolling dynamics were independent of mass and extent and depended solely on the geometric properties of the object. Check it out.
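That geometric claim follows from the standard result for rolling without slipping down an incline: writing the moment of inertia as I = c·m·R², the acceleration is a = g·sin(θ)/(1 + c), so mass and radius cancel and only the shape factor c survives. A quick numerical check of my own (not course code):

```python
import math

def rolling_acceleration(c, theta, g=9.8):
    # Rolling without slipping down an incline of angle theta (radians):
    # with I = c*m*R^2, Newton plus the torque equation give
    # a = g*sin(theta) / (1 + c); m and R cancel out entirely.
    return g * math.sin(theta) / (1 + c)

theta = math.radians(30)
# Shape factors: hoop c=1, solid disk c=1/2, solid sphere c=2/5.
for name, c in [("hoop", 1.0), ("solid disk", 0.5), ("solid sphere", 0.4)]:
    print(name, round(rolling_acceleration(c, theta), 3))
```

Any sphere, from a marble to a bowling ball, rolls down the same incline with the same acceleration, which is exactly what Lewin’s demonstration shows.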
Problem Solving Skills
The most useful components of 8.MReV, in my humble opinion, are the 1000 problems. I know what some of you are thinking, who in their right mind would waste about 100 hours on introductory physics problems during the summer? The more practical among readers of this post might ask, what possible value could a middle aged guy get from a course like this? Sure, if you are a teacher or student of physics, then maybe one could understand taking a course like 8.MReV (or 8.01x, etc.) but what about the remaining 99.993% of the population?
I believe there are three responses, each revealing a deeper layer of understanding of why you might consider such a course. First, no matter what your age, using your mind in a constructive way is a good thing. Studies have shown that mental health and quality of life are both improved by exercising your mind. Lifelong learners know that understanding brings a kind of happiness not found elsewhere.
Second, 8.MReV covers topics that every person should have some grasp of. Recently, I discussed quadcopters with a non-native English speaker. Since we both had a basic understanding of classical mechanics, the depth of our discussion was greatly enhanced. More importantly, developing robust problem solving skills is immediately applicable to practical problems in just about any field. For example, some of us have seen elegant programming implementations, but more often we see confusing code that obscures the logic and is prone to errors. Becoming proficient at finding the elegant solution to a problem that is analytically tractable transfers to real world problems that may have no analytic solution.
Most importantly, problem solving is a key skill of a well educated, 21st century knowledge worker. Anyone can be taught how to grind out a solution using a rote method. However, many problems faced outside of an academic environment require independent thought. 8.MReV, in the tradition of many physics courses, uses problem sets to nurture this ability. The most effective approach to develop this skill is solving real world problems where you can touch and see the system (a yo-yo, a tether ball, a merry-go-round) and build a sensory intuition along with the analytic ability to describe it accurately.
I included a sample of four problems below to illustrate the type of problems found in 8.MReV. One neat aspect of the course is that the level of mathematical sophistication was actually quite low: algebra and basic trigonometry, with calculus appearing in some of the derivations but only once or twice in any of the problems. In fact, most of the problems could be solved with simple algebra. The beauty of the problems illustrated below is that it is not the mathematics that is challenging, but understanding the physics of the problem. Many of the problems in 8.MReV could be solved without using any math at all.
One way to approach the problem below is to consider the possible, extreme values the variables might take on. What would the distance be if the bike was traveling at an incredibly high rate of speed? Alternatively, what would the distance be if the car accelerated very slowly? By looking at these two extremes, the solution can be found for the typical case without any math at all.
A car is stopped at an intersection with a red light, and a biker (in a bike lane) with velocity “v” is approaching the car from behind. When the biker is a distance “d” from the intersection, the light turns green, and the car begins to accelerate at a constant acceleration “a”. In terms of only “d” (not “v” and “a”) calculate the distance at which the biker catches the car if he only barely catches it while it is accelerating.
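If you would rather check the answer with a few lines of symbolic algebra, here is my own sketch in Python using SymPy; the setup and variable names are mine, not part of the course materials. "Barely catches" is the key phrase: at the catch, positions and speeds must coincide.

```python
import sympy as sp

a, d, v, t = sp.symbols('a d v t', positive=True)

# Positions measured from the intersection at the moment the light turns green.
biker_pos = -d + v*t        # biker starts a distance d behind the light
car_pos = a*t**2 / 2        # car accelerates from rest

# "Barely catches" means positions AND speeds coincide at the same instant.
sol = sp.solve([sp.Eq(biker_pos, car_pos), sp.Eq(v, a*t)], [v, t], dict=True)[0]

catch_point = sp.simplify(car_pos.subs(sol))    # distance past the intersection
biker_travel = sp.simplify((v*t).subs(sol))     # total distance the biker covers
print(catch_point, biker_travel)                # d 2*d -- both depend only on d
```

The catch happens a distance d past the intersection (the biker having traveled 2d), independent of "v" and "a", which is exactly what the extreme-value reasoning above predicts.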
This trick, checking the cases where the variables take on extreme values, is handy in almost any field. I have used this trick frequently in solving problems in finance, computing and many other fields.
The solution to the problem below is not something you might guess, but again the mathematics is quite simple, while an understanding of the words used, “average” and “horizontal”, provides the solution. Frequently, the most challenging part of solving a problem is using the correct language, in a precise way, when describing the problem.
Two soccer players kick a ball back and forth toward each other. They start off 50 meters apart, and walk toward each other with equal speeds. They kick the ball continuously until they meet in the middle one minute later. What is the magnitude of the ball’s average horizontal velocity in meters per second?
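The arithmetic is trivial once you see that average velocity depends only on the net displacement, not the ball's zig-zag path. Assuming the ball starts at one player's feet, a few lines of Python make the point:

```python
# Average velocity cares only about net displacement, not the back-and-forth path.
start_separation = 50.0     # meters between the players
elapsed = 60.0              # seconds (one minute)

displacement = start_separation / 2      # the ball ends at the midpoint
avg_velocity = displacement / elapsed
print(round(avg_velocity, 3))            # 0.417 m/s
```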
The problem below has a simple result that you can discover this winter on Spy Pond.
You stand at the end of a long board of length L. The board rests on a frictionless frozen surface of a pond. You want to jump to the opposite end of the board. What is the minimum take-off speed v measured with respect to the pond that would allow you to accomplish that? The board and you have the same mass m.
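Here is my own sketch of how the answer falls out of momentum conservation and projectile motion, again using SymPy; this is my derivation, not the official solution:

```python
import sympy as sp

g, L, vx = sp.symbols('g L v_x', positive=True)

# Equal masses: the board recoils at vx, so the jumper-board gap closes at 2*vx.
# Flight time for vertical take-off speed vy is t = 2*vy/g, and the gap must
# close by a full board length: 2*vx * (2*vy/g) = L, which fixes vy:
vy = g*L / (4*vx)

speed_sq = vx**2 + vy**2                          # take-off speed squared
best_vx = sp.solve(sp.diff(speed_sq, vx), vx)[0]  # minimize over vx
v_min = sp.sqrt(sp.simplify(speed_sq.subs(vx, best_vx)))
print(v_min)    # sqrt(2)*sqrt(g*L)/2, i.e. sqrt(g*L/2)
```

The minimum occurs when the horizontal and vertical take-off speeds are equal, a 45-degree jump in the pond frame, giving v = sqrt(gL/2).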
This last problem has an elegant solution that you discover yourself this summer, although it might be easier for you to be in the boat and not your dog.
A dog sits on the left end of a boat of length L that is initially adjacent to a dock. The dog then runs toward the dock, but stops at the end of the boat. If the boat is H times heavier than the dog, how close does the dog get to the dock? Ignore any drag force from the water.
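The center-of-mass argument can be checked symbolically too. A sketch of my own, assuming the dog starts a full boat length L from the dock:

```python
import sympy as sp

L, H, x_dog = sp.symbols('L H x_dog', positive=True)

# No external horizontal force, so the center of mass stays fixed:
#   m*x_dog + (H*m)*x_boat = 0   =>   x_boat = -x_dog/H   (boat slides away)
x_boat = -x_dog / H

# The dog covers a full boat length L *relative to the boat*:
run = sp.solve(sp.Eq(x_dog - x_boat, L), x_dog)[0]

# Starting a distance L from the dock, the dog ends this far away:
final_gap = sp.simplify(L - run)
print(final_gap)    # L/(H + 1)
```

Note the sanity checks at the extremes: a very heavy boat (large H) lets the dog get arbitrarily close, while a boat no heavier than the dog (H = 1) leaves it half a boat length away.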
I finished the graded portion of 8.MReV on Friday, although I intend to give the optional sections a try. I scored 96% on the checkpoint exercises, 92% on the quizzes and an 86% on the homework for a final score of 88%; the homework accounted for most of the grade. In conclusion, I highly recommend this course for every high school AP physics teacher, aspiring students of any science and those who teach introductory mechanics at any level. This course would also be useful for graduate students in physics studying for their qualifying exam, as a review of the material and the kind of challenging problems typically found there. However, I think that those who would benefit the most from 8.MReV would be anyone who needs to solve problems in their day-to-day life, which is to say everyone.
Not sure if anyone cares, but here is an update on the online courses I started back in February, original post here. Since that time, the BerkeleyX Computer Graphics course, 184.1x, has ended and two others, MITx’s Introduction to Computer Science, 6.00x, and the University of Washington’s Computational Finance and Financial Econometrics, are coming to a close. HarvardX’s Greek Hero class is still ongoing, but as in my first post, this is my most neglected course.
Today, I’ll relate some of my experiences with the MIT course, which I found to be the best implemented of all four courses as well as the most valuable learning experience. Hopefully, now that my workload has dropped, I’ll write up my experience with the Berkeley course soon.
The first third of 6.00x was basic python programming and some classic topics in computer science: search and sort algorithms, recursion, orders of complexity, topics in debugging, that sort of thing. The lecturer, Eric Grimson, is one of these super smart people who can introduce really neat concepts in a simple pedagogical style. The course material started easily enough with examples drawn from mathematics: factorials and Fibonacci sequences to introduce recursion, root finding problems for divide and conquer algorithms, etc. At one point, we were working through a simple algorithm to find the greatest common divisor (GCD), which my fourth grader had been learning in elementary school, when Grimson made the connection to Euclid, who some might claim created the first computer algorithm, and went further to connect the discovery of primes to data encryption techniques. Although familiar with all of these topics (I worked with a friend on developing RSA encryption of bank cards decades ago), Grimson’s genius was tying them all together in a simple exposition of basic programming.
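For anyone who hasn't seen it, Euclid's algorithm fits in a few lines of python. This is my sketch of the standard textbook version, not the course's code:

```python
def gcd(a, b):
    """Euclid's algorithm: replace the pair (a, b) with (b, a mod b) until b is 0."""
    while b:
        a, b = b, a % b
    return a

print(gcd(252, 105))   # 21
```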
Besides algorithms in math, we had problem sets in a couple of simple text parsing programs including word guessing games and Hangman, as well as programming solutions to other games like the Towers of Hanoi and data encryption. All fun stuff. The course was relatively straightforward and then came a mid-term exam and my first test in years.
The first 6.00x midterm exam was available from Thursday, March 21 to Monday, March 25. The test was structured for 90 minutes, but students were allowed 12 hours to complete it, since many students come from regions with slow internet connections and questionable access to power. I took the test on Friday, March 22, starting at 7:00am. I had to stop after the first half hour to drive to work, and then again from 8:30-9:30 to take care of some business. I did finish in 4 hours, but spent significantly less than half that time working on the test. I used only the reference material provided, as well as the python IDE (IDLE) and shell. I got a 92 on the exam, screwing up some True/False questions, which made me feel good, but my happiness was short-lived.
Immediately after the test, I became a bit complacent and worked on some other courses, life issues and business. Frankly, I neglected 6.00x for a week. Big mistake! The lecturer in the videos changed, which was a bit disconcerting, but more importantly, we hit an inflection point in the course with the introduction to object oriented programming. I mentioned in my last post that I never took a computer science class, but have some experience in programming. Sure, I have seen and modified plenty of object oriented code, but never took the time to actually learn the basics. Well, let me tell you, the second segment of 6.00x was completely different from the first set of lectures. The next problem set, which had a two-week deadline straddling the exam period, was friggin’ hard, testing my basic understanding of classes, methods and inheritance. The problem was to write an RSS news feed parser (RIP Aaron Swartz). I worked hard on this problem set and finished it over three days. Ever experience the feeling when you struggle with a set of new concepts and you just don’t get it, and then one day, bang! you understand it? Well, I guess problem set 6 in 6.00x was this “birthing” pain I had signed up for.
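For readers who haven't met classes, methods and inheritance, here is a toy sketch in the spirit of that problem set. The class names are mine and the real ps6 code was far more involved, but it shows the three ideas in miniature:

```python
class NewsStory:
    """A minimal news item, like the objects the RSS parser produced."""
    def __init__(self, title, summary):
        self.title = title
        self.summary = summary

class Trigger:
    """Base class: subclasses decide whether a story is interesting."""
    def evaluate(self, story):
        raise NotImplementedError

class TitleTrigger(Trigger):
    """Inheritance in action: fires when a keyword appears in the title."""
    def __init__(self, word):
        self.word = word.lower()

    def evaluate(self, story):            # overrides the base-class method
        return self.word in story.title.lower()

story = NewsStory("Python Tops the Language Rankings", "...")
print(TitleTrigger("python").evaluate(story))   # True
```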
The next few lectures solidified my newfound understanding of OOP with an introduction to some simple graphing (pylab) and a cool problem set to simulate an iRobot Roomba cleaning machine; my kids loved watching the simulation of hundreds of simulated Roombas “cleaning” a 1000×1000 grid.
The course then jumped into some concepts in statistics and probability to set us up for a problem set involving a virus/drug simulation trial, which was pretty cool. At some point, life intervened and I didn’t finish this problem set, but did enough work so that I finally mastered some basic OOP concepts. I was lucky to have a good understanding of stats, so the coin flipping examples and card drawing exercises were easy and I could focus on the coding. I finished the ninth week doing curve fitting with stochastic simulations of drug treatment plans, building on the virus simulations of the previous weeks. Fortunately, the course provided robust (executable only!) solutions to the problem set I had failed to complete. Whew. This is one shortcoming of other online (and offline) courses that build on previous lectures: if you miss a concept, you cannot complete the course. I found that the 6.00x staff and professors handled this issue quite well.
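The flavor of those coin flipping exercises is easy to convey. This is a Monte Carlo sketch in my own words, not the course's code: simulate many runs of fair-coin tosses and average the fraction of heads.

```python
import random

def mean_heads_fraction(flips, trials, seed=6):
    """Simulate `trials` runs of `flips` fair-coin tosses; average the heads fraction."""
    rng = random.Random(seed)   # seeded so the result is reproducible
    total = 0.0
    for _ in range(trials):
        heads = sum(rng.random() < 0.5 for _ in range(flips))
        total += heads / flips
    return total / trials

print(round(mean_heads_fraction(flips=100, trials=1000), 2))   # close to 0.5
```

The same skeleton (draw, tally, average) underlies the virus simulations, with the coin flip replaced by a probabilistic reproduce-or-die step.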
This brought us to the second mid-term exam. Same rules as before, but I was much better prepared. I took my time and had to deal with a bunch of unrelated stuff, but finished the exam and scored around the same as before. I looked at the discussion forum after the exam ended. There had been various discussions about the number of students enrolled (I had seen guesstimates from 20,000 to 60,000). Apparently, the number had dropped significantly after the first exam and even more so after the second exam. I would love to see the actual enrollment and completion statistics.
Generally, I have avoided the discussion groups, which can provide valuable hints to problem sets, but were also weighted down by complaints and demands to staff for help in overcoming hurdles. Frankly, problem solving is the whole point of most academic courses of study. Reading some of the discussions well after the fact brought smiles since a lot of people struggled with the same stuff that I had. Misery does love company!
At this point, with one problem set left, a couple of lecture series and the final exam, I have already passed the course. I intend to stick it out and try for a good final grade, but I have already accomplished what I wanted and feel good about 6.00x in particular. I have nothing but praise for the class, the set-up was easily the most impressive of the four courses I have taken so far. I really appreciated the work the 6.00x staff put into the auto-grader, even when it appeared to make mistakes :–()!! but most of all I thank MIT and edX for making this intellectual exercise possible.
I finished the MITx 6.00x on Friday. The last set of lectures was really interesting, covering topics in graph theory, dynamic programming and a series of guest lectures by researchers. One in particular, on column-oriented database design, was of great relevance, since that is how I have stored my data, in what I call case series, for the past 10 years or so. I’ll have to look into the C-Store page to see how a group of smart guys implemented this idea.
The final exam started on Thursday. I’m somewhat disappointed at my performance on the final, mostly because I attempted it on Friday while work and other concerns occupied my mind. That said, I’m ok with my overall performance, getting a final grade of 92 for the course. I’ve already signed up for half a dozen more classes and look forward to finding other courses nearly as good as 6.00x!
I am a new contributor to the TruePersons blog and I want to introduce myself. I am a middle-aged white family man, not unlike many people in Arlington. I work for a small company in a secure job that pays me quite well but is not intellectually stimulating. I have involved myself in the Arlington community for decades; helping out in my kids’ schools, with their sports teams, with a variety of non-profits and charities, with our church and through community programs sponsored by my employer. I chose to remain anonymous on this blog for a complex set of reasons that I am not going to discuss here. If that hurts my credibility with the readers, so be it. I provided a little bit of background as if to say, I’m pretty much an average kind of guy.
Recently, I decided I wanted to learn some new things and signed up for several online courses. The topic of my contributions to the TruePersons blog will be my experiences in taking these classes and the tangents they take me on. I hope that some will find what I write informative. Also, I want to share some of the cool things I hope to learn. My blog posts might be a bit too technical for some, but again, so be it. For anyone who chooses to read my drivel, I hope you find it both interesting and motivating.
There are many different online courses being offered through different universities. For the most part, these courses are aimed at undergraduate level education and cover mostly science, math and engineering courses although this mix is changing rapidly.
I’m not sure of the history of these courses, but I want to share some of my observations. First, there is MIT OpenCourseWare, where over 2000 MIT courses with class notes, problem sets and video lectures have experienced 125,000,000 visits over the past decade. I believe this experience led MIT and Harvard to start a new venture called edX that offers fully sequenced online courses with video lectures, graded exercises, homework, discussion groups and even meetups. edX launched with a classic MIT class in electronics that saw over 160,000 registered participants. Classes are available around the same fall/spring semesters as the physical university, but with delayed starting times. The courses are self-paced, meaning you can listen to the lectures at any time, but the few classes I’ve seen so far do have homework due dates as well as scheduled tests.
The best thing about these courses is that they are free.
I chose three courses: one from MITx titled Introduction to Programming, which I will refer to as CompSci; the HarvardX course titled The Ancient Greek Hero; and a class from Berkeley, Foundations of Computer Graphics, which I’ll call CompGraphics. Although I am somewhat of a renaissance geek and feel comfortable changing programs in a dozen languages, I have never taken an actual course in a programming language, so I figured the intro course that focused on python and the computer graphics course would not be beyond me. Maybe, with a family, job and other commitments, I might have bitten off more than I can chew; we shall see.
I’ll also admit I am fortunate to have had a classical education, like 35 years ago, so I am not unfamiliar with Greek mythology and literature, but I haven’t thought about this stuff in decades and believe a refresher might be stimulating. In addition, the programming courses are offered by MIT and Berkeley, while the Greek hero class is offered by Harvard. I am interested in experiencing how these three institutions implemented the courses. Also, I am interested in how a humanities course would be implemented in a MOOC.
Today, I stumbled across another MOOC platform, Coursera, which offers thousands of college courses from great universities like Stanford, Princeton and many others. Unable to resist, I signed up for an Introduction to Computational Finance and Financial Econometrics which I’ll just refer to as Finance. I have professional experience with some of the topics covered in the syllabus and some experience with the R-programming language used throughout the course, but again, I have never had a formal course in the materials and thought a systematic review might be informative. Coursera seems to have a different business model than edX as well as a completely different platform for the online course work. This was part of the reason I decided to take part in yet another class; I’d like experience from different platforms to really evaluate, compare and contrast the different styles.
One noticeable difference is that the lectures on Coursera, or at least in the one I am taking, are videotaped classroom lectures. So the smart boards and the lecturer at the front of the classroom are all “live”, which is different from the edX “pre-recorded” lectures. I actually prefer the edX presentation, at least so far. Also, the Coursera platform offered me a “certificate of completion” for $3600. Interesting! As a money maker, one might imagine that a normal course load of 4 undergraduate courses per semester would translate into almost $30,000 for the university, with little marginal cost to adding new students.
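The back-of-the-envelope arithmetic behind that almost-$30,000 figure, taking the certificate price quoted to me and a full-time course load over two semesters:

```python
price_per_certificate = 3600   # dollars, as offered on the course page
courses_per_semester = 4
semesters_per_year = 2

annual_revenue_per_student = (price_per_certificate
                              * courses_per_semester
                              * semesters_per_year)
print(annual_revenue_per_student)   # 28800 -> "almost $30,000"
```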
The edX classes are also incredibly well subscribed, with over 20,000 students in the CompSci class and 27,000 in the Greek Heroes class.
I had a busy weekend with family and did not use my computer until Monday am. Holy crap, over 4 hours of new lectures, homework in all four classes and I haven’t done a damn thing in the HarvardX Greek Heroes course. Worse, the new Berkeley class on Computer Graphics has started up. On top of all this, the MITx course in CompSci is having the first test starting on 3/21. I got ahead in this class, but now I haven’t thought about it for more than a week, need to finish the exercises for the damnable complexity lectures to figure out the one problem I haven’t fully understood and it probably makes sense to review the course work so far before taking the exam. Of course, the next set of lectures and homework assignment, not on the test, have been made available.
Business is also putting demands on me, with our largest client emailing me a project before 8am, a retail client setting up a meeting for potential referral business and the urgency to make some decisions for a new client on top of the daily routine. Arrrg, what have I done to myself!
Ok, deep breath, one thing at a time. First, I took care of business. Then I started the new course, the BerkeleyX Computer Graphics class. I watched the video lectures, not too bad at 1.5x speed. One of the cool things about the embedded video lectures is that you can change the speed: from normal (1.0x) up to 1.5x on edX and 2.0x on Coursera. One of the lecturers is pretty boring, so the increased speed is a time saver. With the follow-along transcript, I find it easiest to read what is being said and hear the lecturer at a slow reading pace, but still fast listening pace.
The introduction and first set of lectures in the CompGraphics course were easy. The review of linear algebra was elementary, although the lecture exercises were designed to be tricky. The first homework was to compile a C++ framework that was provided to ensure that we students had a compatible development environment. I decided I was not going to use my linux environment, which has all sorts of C++ compilers, openGL, etc., because of the age of some of the installed components. Instead, I downloaded Microsoft’s Visual Studio Express, the free version. The software took an hour or so to download and install, but the code opened easily and, with no issues at all, I generated the images, saved the screenshots and uploaded the resulting images to the auto-grader, which compared my local image to their stored image on a pixel by pixel basis and confirmed that I had the correct graphic. Instant feedback is a nice feature of these MOOCs.
Computer Graphics done for the week!
I also worked through the Finance class. One downside to this class is the length of the video lectures, almost 2 hours per week. Even at high speed, this is quite the commitment. The problem sets are pretty basic, but I have already picked up a few cool techniques in the R programming language that the course uses. One neat tool is the RStudio integrated development environment. I also participated in the discussion section, answering a few simple questions. I figure that this is a small price to pay for a free course.
After about 4 hours, I had done most of the work in the three technical classes, but still nothing in the Greek class.
I’ll probably write new posts every week or so, seems like I may be busy doing the classes themselves!
Two weeks ago, on Tuesday October 9 around 8pm, my 12 year old daughter fell down during hockey practice at the Ed Burns Arena (aka the rink) and broke her arm about two inches below the wrist, fracturing the ulna and radius, displacing them with a high degree of angulation. The x-ray was sickening, for a novice like me, to look at.
Immediately beforehand, I was engrossed in a heated discussion with people in the lobby of the rink about some political foolishness I was involved with, the Special Town Meeting on the following night. All thoughts of which vanished when another dad came to the door to tell me my daughter had been injured.
One good thing about Arlington sports is the number and quality of town employees involved; cops, firemen, rec department personnel. I was fortunate that the fire chief was there and knew exactly what to do. Ice bags appeared as did a temporary sling within what seemed like seconds. Other dads helped take off her skates and collect her equipment from the locker room. Kudos to Arlington’s finest.
What brings us together is more important than what separates us, as was clearly displayed when the heated argument gave way to the universal need to administer aid to the injured (my daughter).
We went to Winchester Hospital, breezed through triage and only had to wait for her dinner from 6pm to digest before they gave her a sedative to have a doctor from Excel Orthopedics set her arm. We got home around midnight, both of us exhausted and cranked up from the experience.
The next night, I opened my comments at Town Meeting with “I would rather not be here tonight. I have more important personal and professional things to do.” This was met later, in writing by an elected official, who happens to be childless, with
I thought it was a bit disingenuous of Mr. Harrington to say that he had better things to do
and this statement was singled out as the defining “sound bite” by a controversial, local “journalist” in his blog, all within 12 hours. I wasn’t aware of this for a few days, since the day after the Special Town Meeting, which ended after 11:00PM, I was driving my son at 5:30AM to his high school water polo practice at UMass Boston; his usual mode of public transportation does not run that early.
Elected officials and political opponents are quick to attack and slow to understand things beyond their personal experience.
We returned to Excel twice over the past two weeks for follow-up and x-rays. The first x-ray, just two days after the injury, did not suggest, to my inexperienced eye, that the bones were set properly. There was a noticeable displacement on one lateral shot. The doctor assured me that this was not unusual: the bones never align exactly, they reconstitute themselves, and it is best to set the bones “high” so that movement brings them together. I believed her. She said she wanted to see my daughter in a few days (the following Tuesday). The clerk handling appointments scoffed at an appointment so close, and scheduled us for a week later than the doctor had told me, over my protests.
Excel Orthopedics is a large, very busy, efficient and factory-like medical care facility that specializes in sports medicine and bone injuries, or so it appears to me. My daughter had had an appointment there just two weeks before her injury for soreness in her hip and heel. During the consult, the young male physician’s assistant asked my daughter about her activities and, when told about her ballet, asked her if she “danced in the nude”, meaning, apparently, if she danced in bare feet. He was appropriately mortified by his slip, but Freudian or not, I decided at that moment there would be no follow-up appointment with this provider. He was fired.
We did not choose Excel for the wrist injury; Winchester Hospital decided to use them during the emergency room visit. I was appropriately wary the night of the injury and at the follow-up a few days later, but deferred to the doctor. My daughter’s cast was very small; she complained it did not have enough room for all of the signatures from her friends, teammates and school chums. The problems and priorities of a pre-teen!
In fact, she was right in a very important way: the cast was too small. Yesterday, two weeks after the injury, we returned for another x-ray and follow-up. The x-ray clearly showed the ulna and radius not aligned. The doctor had not taken the rotational movement and poor set into account, and after a consult that lasted less than 2 minutes, she wanted to schedule an emergency surgery for the next day (today).
Don’t hesitate to fire a service provider who makes you feel uncomfortable in any way; for a verbal “slip”, assembly-line health care or a too busy doctor.
Later last night, I called a surgeon I have great respect for. She answered her own phone right away. She listened, pulled up the x-rays, confirmed my concerns, made a referral, called the new doctor (head of orthopedics at a major Boston hospital), called me back with an appointment for the very next morning (today), and sent me email with the particulars, all within about 15 minutes. The new doctor saw us right away today, even though there was a waiting room full of patients, set up the surgery and made sure we were comfortable with the procedure scheduled for tomorrow. What a difference between NEMC and Excel.
Competency speaks for itself and requires no advocacy.
I’m sharing this story since it is not atypical of what many people have to go through on a regular basis. It serves as a nice parable about what is right, and what is so clearly wrong, with many of our interactions with service providers, be it in health services or in public service, and it reminds me, anyhow, of what is important in life and what is not.
Stephen Harrington was born in 1961 at Symmes Hospital in Arlington, Massachusetts to Catherine (McDonald) and Laurence P. Harrington, a CPA and chief financial officer of a high tech company in Bedford, MA.
Steve was raised in Arlington, attending the Saint Agnes School and Boston College High School at Columbia Point in Boston. During high school, Steve excelled in math and science and earned a 4.0 GPA, graduating in the top 10 of his high school class. Steve was a member of the prestigious Homeric Academy, translating the Odyssey from the original Greek, and was a member of the debating team. Steve worked as a bank teller for the Arlington Five Cents Savings Bank gaining money handling experience while in high school. Steve swam competitively, enjoyed skiing and many other outdoor activities.
Steve earned a merit scholarship to Boston College where he majored in physics and mathematics and graduated from the University of Lowell in 1983 with a B.Sc. in physics after a combined three years. Steve took part in the Great Books curriculum at Boston College furthering his classical education. During that time, Steve worked full time for the Boston Company as a settlements clerk in Boston to pay for his college education. Steve handled electronic delivery of stocks and bonds for pension and trust clients interfacing with regulatory agencies, other banks and brokers streamlining the bank’s settlement processes and gaining a ground level view of investment trading and custody until 1985.
For the next three years, Steve worked for Fidelity Investments in Boston, designing their global accounting system and working on new business ventures. In 1986, Steve suggested to owner Ned Johnson that Fidelity take custody of their mutual fund assets. Steve was the business expert creating Fidelity Custody Company that employed 600 people by the time Steve left Fidelity in 1987.
After the market crash of 1987, Steve returned to Boston College to study topology and graduate level mathematics. Steve consulted with the Boston Company on global trading and accounting, setting up their mortgage backed bond payment system and duration matching their $14B securities lending portfolio to client payouts. Steve designed bond amortization, short sales accounting and yield calculations for more than $500B in assets.
In 1988, Steve started in a graduate studies program at Boston University, taking over 20 graduate courses in mathematics and researching dynamical systems under Dr. Robert Devaney. At that time, Steve worked with the first generation Connection Machine, a 65,536 node, massively parallel computer, to investigate research problems in dynamical (chaotic) systems. Steve learned graphics computing and earned an honorable mention in the ACM SIGGRAPH exposition with a simulation of fractal Julia sets.
In 1991, Steve formally entered the graduate physics program at Boston University while still working as a consultant to the Boston Company. Steve worked for two years as a researcher under renowned physicist Claudio Rebbi using parallel computers from IBM, Cray, Silicon Graphics and Thinking Machines to investigate research problems in lattice quantum chromodynamics. Steve took over 40 courses in physics, having the opportunity to study under Nobel laureate Yakir Aharonov among other world-class physicists. In 1993, Steve launched one of the first websites and, in 1996, won a “Best of the Web” award for a Java applet simulating the Schrödinger wave equation. He won another honorable mention at SIGGRAPH for a three-dimensional computer graphics simulation of two colliding wavepackets.
In 1994, Steve switched from theoretical particle physics to statistical mechanics, advised by Dr. H. Eugene Stanley. Over the next three years, Steve published 14 papers in peer reviewed journals, with three articles in the prestigious Physical Review Letters, gaining thousands of citations. Steve’s research focused on spatial and temporal correlations in supercooled and glassy water as well as fundamental research in networks of physical systems. Steve traveled to Asia, Europe and all over the United States to present papers and take part in conferences on a variety of topics in physics. Steve won a highly coveted National Institutes of Health award supporting his research and taught undergraduate physics and computational science classes. Steve’s research into the dynamics of water provided insights into protein folding and contributed to an Ether Dome talk on the role of Apolipoprotein-3 in the formation of plaques in Alzheimer’s patients.
In 1995, Steve married Maria Carr and bought his current home in Arlington.
Steve received his doctorate in Physics in 1997 and was awarded an appointment as a Research Professor of Physics at Boston University for the 1998 school year.
Steve did a short stint at the Genetics Institute in Cambridge where he set up their computational bio-informatics lab and consulted with a start-up in providing an online encyclopedia of engineering processes.
In the fall of 1997, Steve was hired as a “quant” for D. E. Shaw & Co., a Manhattan-based hedge fund that pioneered high frequency trading and statistical arbitrage. Steve worked on a multi-manager hedge fund product and developed statistical models for a wide range of investment products. For the first six months, Steve commuted weekly between his midtown Manhattan office and his home in Arlington, living on the Upper West Side during the week. Steve’s wife Maria and their newborn son, Charles, moved to the NYC area soon after that. Steve gained exposure to a wide variety of hedge fund strategies working for that premier quantitative hedge fund.
In 1999, Steve was offered a position with Batterymarch Financial, a Boston-based institutional asset manager. After a short time, Steve was named one of two portfolio managers overseeing $4B in pensions and endowments using factor based quantitative investment management. Steve developed trading strategies and new products, bringing a small cap portfolio from $5M to $300M over a one year period of marketing. Steve was involved in marketing Batterymarch’s investment products to a wide range of institutional clients. Steve was the portfolio manager for the Legg Mason Market Neutral Mutual Fund, one of the first mutual funds to deploy short positions for mutual fund clients.
In 2000, Steve was hired by Fidelity Investments as the first outside hire in a new venture, managing the firm’s capital at Geode Capital, a Fidelity subsidiary. Steve developed and managed US and Japanese equity statistical arbitrage strategies and was responsible for recommending asset allocation and strategic product development. Steve was a member of Geode’s Investment Committee, overseeing the firm’s investments in capital structure arbitrage, convertible bond arbitrage, statistical arbitrage and equity long/short portfolios. Steve helped develop an Exchange Traded Fund that tracked the Nasdaq composite and worked with fundamental analysts at Standard & Poor’s to identify opportunities in forensic accounting, avoiding losses from Enron, Adelphia and other fraudulently managed corporations.
By 2004, Steve decided to form his own investment management company, Scientific Advisors, LLC and registered as an independent investment adviser. During the first four years, Steve sold trading models to a variety of hedge funds covering equity, currency, commodity and derivative asset classes. Steve also developed a series of risk-based retirement fund products that he continues to sell through broker dealers. Steve is registered as an investment adviser in Massachusetts, New York and Texas and has never had a complaint.
Steve’s academic and work history displays two common threads: an interest, combined with talent, in mathematics and science, and deep experience with some of the premier companies in finance. Steve has always worked at the intersection of finance and technology. Today Steve enjoys his family, living in the town he grew up in, and mentoring technology based start-up companies among other business interests.
Steve credits his success in life to the support and love of Maria with whom he has three healthy, happy children.