Thank you for your donation!


Cloudsmith graciously provides open-source package management and distribution for our project.


Idea: Genius
#6
I tested then link from above with the python script.
Needed some fixes, but finally got it working - and it is much faster than the node solution I posted above.
But maybe the aprroach is not the best as it searches first for a song title... well, if this title is quite common I think You know what will happen... :Big Grin

EDIT: I found out that You can add the artist to the songtitle query - so the query is nicely narrowed down
EDIT: also added a query for the song that is currently playing in mpd

So - for those who are interested:

-In the line at the beginning: 
headers = {'Authorization': 'Bearer <<<YOUR Client Access Token>>>'}
You will have to enter Your Client Access Token

I left some prints so You can see what's happening

Code:
#!/usr/bin/python3
# -*- encoding: utf-8 -*-

import requests
from bs4 import BeautifulSoup
from os import popen
from time import sleep
base_url = "https://api.genius.com"
headers = {'Authorization': 'Bearer <<<YOUR Client Access Token>>>'}
artist_name = popen('mpc --format %artist% | head -n 1').read().replace("\n","")
song_title = popen('mpc --format %title% | head -n 1').read().replace("\n","")

def lyrics_from_song_api_path(song_api_path):
  song_url = base_url + song_api_path
  response = requests.get(song_url, headers=headers)
  json = response.json()
  path = json["response"]["song"]["path"]
  #gotta go regular html scraping... come on Genius
  page_url = "https://genius.com" + path
  print(page_url)
  print()
  print()
  res= None
  i=0
  while res == None:
    if i > 5: break
    page = requests.get(page_url)
    html = BeautifulSoup(page.text, "html.parser")
  #remove script tags that they put in the middle of the lyrics
#  [h.extract() for h in html('script')]
  #at least Genius is nice and has a tag called 'lyrics'!
    res = html.find('div', class_='lyrics')
    i+=1
    sleep(1)
  if res:  
    lyrics = res.get_text() #updated css where the lyrics are based in HTML
    return lyrics
  else:
    return "Found lyrics page but unable to query the lyrics"
if __name__ == "__main__":
  search_url = base_url + "/search"
  data = {'q': song_title + ' ' + artist_name}
  response = requests.get(search_url, params=data, headers=headers)
  print(response.url)
  json = response.json()
  song_info = None
  for hit in json["response"]["hits"]:
    if hit["result"]["primary_artist"]["name"] == artist_name:
      song_info = hit
      break
    else:
      print(hit)
  if song_info:
    song_api_path = song_info["result"]["api_path"]
    print(song_api_path)
    print()
    print(artist_name + " - " + song_title)
    print (lyrics_from_song_api_path(song_api_path))
Cheers, Stephan

EDIT: while the query is well narrowed down, there was still a problem with requesting the final page containing the lyrics - dunno what's happening, maybe the page is not completely loaded - but, I created a loop which tries to receive the content in several attempts... seems to work atm
Reply


Messages In This Thread
Genius - by kit1cat - 12-20-2020, 08:18 PM
RE: Genius - by Tim Curtis - 12-20-2020, 08:39 PM
RE: Genius - by Stephanowicz - 12-20-2020, 09:34 PM
RE: Genius - by kit1cat - 12-20-2020, 09:24 PM
RE: Genius - by kit1cat - 12-20-2020, 09:43 PM
RE: Genius - by Stephanowicz - 12-22-2020, 05:19 PM
RE: Genius - by Stephanowicz - 12-24-2020, 06:54 PM
RE: Genius - by Tim Curtis - 12-24-2020, 10:06 PM
RE: Genius - by Stephanowicz - 01-17-2021, 11:42 AM
RE: Genius - by kit1cat - 01-17-2021, 12:31 PM
RE: Genius - by Stephanowicz - 01-17-2021, 12:58 PM
RE: Genius - by kit1cat - 01-17-2021, 05:00 PM
RE: Genius - by kit1cat - 01-17-2021, 05:56 PM

Forum Jump: