Stephanowicz · (This post was last modified: 12-22-2020, 09:32 PM by Stephanowicz.)

I tested the link from above with the Python script.
Needed some fixes, but finally got it working - and it is much faster than the node solution I posted above.
But maybe the approach is not the best, as it searches first for a song title... well, if this title is quite common I think You know what will happen... :D

EDIT: I found out that You can add the artist to the song title query - so the query is nicely narrowed down
EDIT: also added a query for the song that is currently playing in mpd
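
For anyone who wants to see the narrowing idea in isolation, here is a minimal sketch. It is not the script's actual code: the helper names build_search_params and pick_hit are made up for illustration, the sample hits are invented data, and the case-insensitive comparison is an assumption (the script below compares the artist name exactly). The hit layout (result -> primary_artist -> name) is the one the script below reads from the search response.

```python
def build_search_params(song_title, artist_name):
    """Combine title and artist into one search query (hypothetical helper)."""
    parts = [part.strip() for part in (song_title, artist_name)]
    return {"q": " ".join(part for part in parts if part)}


def pick_hit(hits, artist_name):
    """Return the first hit whose primary artist matches, else None.

    Assumption: matching ignores case and surrounding whitespace.
    """
    wanted = artist_name.strip().casefold()
    for hit in hits:
        name = hit["result"]["primary_artist"]["name"]
        if name.strip().casefold() == wanted:
            return hit
    return None


if __name__ == "__main__":
    # Invented sample data, shaped like the hits the script iterates over
    sample_hits = [
        {"result": {"primary_artist": {"name": "Some Other Band"}, "api_path": "/songs/1"}},
        {"result": {"primary_artist": {"name": "Example Artist"}, "api_path": "/songs/2"}},
    ]
    print(build_search_params("Example Title", "Example Artist"))
    print(pick_hit(sample_hits, "example artist")["result"]["api_path"])
```

Ignoring case is a bit more forgiving when the tags in mpd are written differently from the artist name on Genius, but it is still an exact-name match, so features and "The ..." variants will not be caught.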

So - for those who are interested:

-In the line at the beginning:
headers = {'Authorization': 'Bearer <<<YOUR Client Access Token>>>'}
You will have to enter Your Client Access Token
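
If You do not want to write the token into the script itself, one option is to read it from an environment variable. This is just a sketch under assumptions: build_headers is a made-up helper, and GENIUS_ACCESS_TOKEN is a variable name I picked for illustration, not something Genius defines.

```python
import os


def build_headers(env_var="GENIUS_ACCESS_TOKEN"):
    """Build the Authorization header from an environment variable.

    env_var is a hypothetical variable name chosen for this sketch.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError("Set " + env_var + " to Your Client Access Token")
    return {"Authorization": "Bearer " + token}
```

In the script, the headers line would then become headers = build_headers(), after exporting the variable in the shell that starts the script.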

I left some prints so You can see what's happening

Code:
#!/usr/bin/python3
# -*- encoding: utf-8 -*-

import requests
from bs4 import BeautifulSoup
from os import popen
from time import sleep

base_url = "https://api.genius.com"
# replace the placeholder with Your Client Access Token
headers = {'Authorization': 'Bearer <<<YOUR Client Access Token>>>'}

# ask mpc for artist and title of the song that is currently playing in mpd
artist_name = popen('mpc --format %artist% | head -n 1').read().strip()
song_title = popen('mpc --format %title% | head -n 1').read().strip()

def lyrics_from_song_api_path(song_api_path):
  song_url = base_url + song_api_path
  response = requests.get(song_url, headers=headers)
  song_json = response.json()
  path = song_json["response"]["song"]["path"]
  # gotta go regular html scraping... come on Genius
  page_url = "https://genius.com" + path
  print(page_url)
  print()
  print()
  res = None
  # the lyrics div is not always in the returned page, so try up to 6 times
  for attempt in range(6):
    page = requests.get(page_url)
    html = BeautifulSoup(page.text, "html.parser")
    # remove script tags that they put in the middle of the lyrics
    # [h.extract() for h in html('script')]
    # at least Genius is nice and has a tag called 'lyrics'!
    res = html.find('div', class_='lyrics')
    if res is not None:
      break
    sleep(1)  # wait a moment before the next attempt
  if res is not None:
    lyrics = res.get_text()  # updated css where the lyrics are based in HTML
    return lyrics
  else:
    return "Found lyrics page but unable to query the lyrics"

if __name__ == "__main__":
  search_url = base_url + "/search"
  # search for title plus artist to narrow down the hits
  data = {'q': song_title + ' ' + artist_name}
  response = requests.get(search_url, params=data, headers=headers)
  print(response.url)
  search_json = response.json()
  song_info = None
  # take the first hit whose primary artist is exactly the artist from mpd
  for hit in search_json["response"]["hits"]:
    if hit["result"]["primary_artist"]["name"] == artist_name:
      song_info = hit
      break
    else:
      print(hit)
  if song_info:
    song_api_path = song_info["result"]["api_path"]
    print(song_api_path)
    print()
    print(artist_name + " - " + song_title)
    print(lyrics_from_song_api_path(song_api_path))

Cheers, Stephan

EDIT: while the query is well narrowed down, there was still a problem with requesting the final page containing the lyrics. Dunno what's happening, maybe the page is not completely loaded, but I created a loop which tries to fetch the content in several attempts... seems to work atm
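
The "try several times" idea from this edit can also be written as a small standalone helper. This is only a sketch with made-up names: fetch_with_retries is hypothetical, fetch stands for any function without arguments that returns None when the lyrics were not found, and the pause parameter is just there so the loop can be tried out without really waiting.

```python
from time import sleep


def fetch_with_retries(fetch, attempts=6, delay=1.0, pause=sleep):
    """Call fetch() until it returns something other than None.

    Returns the first non-None result, or None if all attempts fail.
    Waits delay seconds between attempts, but not after the last one.
    """
    for attempt in range(attempts):
        result = fetch()
        if result is not None:
            return result
        if attempt < attempts - 1:
            pause(delay)
    return None


if __name__ == "__main__":
    # Stand-in for the page request: fails twice, then "finds" the lyrics
    outcomes = iter([None, None, "la la la"])
    print(fetch_with_retries(lambda: next(outcomes), delay=0.1))
```

In the script above, fetch would be a small function that requests the page and returns the result of html.find('div', class_='lyrics').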
