
Groovy : Parse All Soccer Players Info

I am new to Groovy and, coming from Java, am still getting used to the scripting way of doing things. So, as a learning exercise, I wrote the following lines to parse information about all the soccer players on ESPN Soccernet. I used the Jsoup library to fetch and parse the documents.

import org.jsoup.nodes.Element

def leagues = [
        "http://soccernet.espn.go.com/clubs/_/league/eng.1/english-premier-league?cc=4716",
        "http://soccernet.espn.go.com/clubs/_/league/esp.1/spanish-la-liga?cc=4716",
        "http://soccernet.espn.go.com/clubs/_/league/ita.1/italian-serie-a?cc=4716",
        "http://soccernet.espn.go.com/clubs/_/league/ger.1/german-bundesliga?cc=4716",
        "http://soccernet.espn.go.com/clubs/_/league/fra.1/french-ligue-1?cc=4716",
]

leagues.each {leagueUrl ->
    Utils.getDocument(leagueUrl).select("table[class=tablehead]").get(0).select("td:eq(2)").select("a[href]").each {teamStatsUrl ->
        Utils.getDocument(teamStatsUrl.attr("abs:href")).select("tbody").each {playerGroup ->
            playerGroup.select("td:eq(1)").select("a[href]").each {playerLink ->
                Element playerProfile = Utils.getDocument(playerLink.attr("abs:href")).select("div.profile").get(0)
                String playerName = playerProfile.select("h1").text()

                // Collect "Label: value" items into a map; items without a colon are team names.
                def profileProperties = [:]
                playerProfile.select("li").each {item ->
                    String[] itemProperties = item.text().split(":")
                    if(itemProperties.size() == 1) profileProperties.get("teams", []).add(itemProperties[0])
                    else profileProperties[itemProperties[0]] = itemProperties[1]
                }
                println playerName + " " + profileProperties
            }
        }
    }
}
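
The innermost loop relies on two small Groovy idioms: `split(":")` to separate a label from its value, and `Map.get(key, defaultValue)`, which stores the default under the key the first time it is requested. The sketch below isolates that logic on plain strings so it can be run without any network access. The `items` list is made-up sample data shaped like the profile entries, not something taken from the site.

```groovy
// Hypothetical <li> texts, shaped like the entries on a player profile page.
def items = ["Squad No: 8", "Position: Midfielder", "England", "Chelsea"]

def profile = [:]
items.each { text ->
    String[] parts = text.split(":")
    if (parts.size() == 1) {
        // No colon: treat the item as a team name.
        // get(key, default) puts the default into the map on first use and returns it.
        profile.get("teams", []).add(parts[0])
    } else {
        // The value keeps the space that followed the colon in the source text.
        profile[parts[0]] = parts[1]
    }
}
println profile
```

Note that a value containing a colon of its own would be cut short by this approach; `split(":", 2)` would be one way to keep the remainder intact.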

All that Utils.getDocument(url) does here is call Jsoup.connect(url).get() inside a loop, with the number of retries set to 5. The script produces output like the following:

Ramires [Full Name: Ramires, Squad No: 7, Position: Midfielder, Age: 24, Birth Date: Mar 24, 1987, Birth Place: Barra do Piraí, Rio de Janeiro, Brazil, Height: 5' 11'' (1.80m), Weight: 73 kg, teams:[Brazil, Chelsea]]
Frank Lampard [Squad No: 8, Position: Midfielder, Age: 33, Birth Date: Jun 21, 1978, Birth Place: Romford, Height: 6' 0" (1.83m), Weight: 174 lbs (78.7 kg), teams:[England, Chelsea]]
Fernando Torres [Full Name: Fernando Torres, Squad No: 9, Position: Forward, Age: 27, Birth Date: Mar 20, 1984, Birth Place: Fuenlabrada, Madrid, Height: 6' 1'' (1.85m), Weight: 174 lbs (78.7 kg), teams:[Spain, Chelsea]]
John Mikel Obi [Squad No: 12, Position: Midfielder, Age: 24, Birth Date: Apr 22, 1987, Birth Place: Jos, Nigeria, Height: 5' 11'' (1.80m), Weight: 179 lbs (81.3 kg), teams:[Chelsea]]
Raul Meireles [Full Name: Raul Meireles, Squad No: 16, Position: Midfielder, Age: 28, Birth Date: Mar 17, 1983, Birth Place: Porto, Portugal, Height: 1.79m, Weight: 65 kg, teams:[Chelsea, Liverpool, Portugal]]
Branislav Ivanovic [Squad No: 2, Position: Defender, Age: 27, Birth Date: Feb 22, 1984, Birth Place: Sremska Mitrovica, Yugoslavia, Height: 6' 2" (1.88m), Weight: 86 kg, teams:[Serbia, Chelsea]]
Juan Mata [Full Name: Juan Mata, Squad No: 10, Position: Forward, Age: 23, Birth Date: Apr 28, 1988, Birth Place: Burgos, Spain, Height: 1.70m, Weight: 61 kg, teams:[Spain, Valencia, Chelsea, Spain U21]]
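
The Utils class itself is not shown in the post, so here is a minimal sketch of what a retrying getDocument helper might look like. The `withRetries` method, the `MAX_RETRIES` constant, and the pinned Jsoup version in `@Grab` are my assumptions for illustration; only the `Jsoup.connect(url).get()` call and the retry count of 5 come from the description above.

```groovy
@Grab('org.jsoup:jsoup:1.17.2') // assumed version; any recent Jsoup release should do
import org.jsoup.Jsoup
import org.jsoup.nodes.Document

class Utils {
    // Retry count taken from the description of the original helper.
    static final int MAX_RETRIES = 5

    // Hypothetical helper: runs the action until it succeeds or the attempts run out.
    // Expects maxAttempts to be at least 1.
    static <T> T withRetries(int maxAttempts, Closure<T> action) {
        IOException lastError = null
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call()
            } catch (IOException e) {
                // Jsoup reports HTTP and network failures as IOExceptions.
                lastError = e
            }
        }
        throw lastError
    }

    // Fetches and parses the page, retrying on I/O failures.
    static Document getDocument(String url) {
        withRetries(MAX_RETRIES) { Jsoup.connect(url).get() }
    }
}
```

Keeping the loop in a separate method is a design choice of this sketch: it lets the retry behaviour be exercised with any closure, without hitting the network.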
