12-22-2020, 05:19 PM
(This post was last modified: 12-22-2020, 09:32 PM by Stephanowicz.)
I tested then link from above with the python script.
Needed some fixes, but finally got it working - and it is much faster than the node solution I posted above.
But maybe the aprroach is not the best as it searches first for a song title... well, if this title is quite common I think You know what will happen... :
EDIT: I found out that You can add the artist to the songtitle query - so the query is nicely narrowed down
EDIT: also added a query for the song that is currently playing in mpd
So - for those who are interested:
-In the line at the beginning:
headers = {'Authorization': 'Bearer <<<YOUR Client Access Token>>>'}
You will have to enter Your Client Access Token
I left some prints so You can see what's happening
Cheers, Stephan
EDIT: while the query is well narrowed down, there was still a problem with requesting the final page containing the lyrics - dunno what's happening, maybe the page is not completely loaded - but, I created a loop which tries to receive the content in several attempts... seems to work atm
Needed some fixes, but finally got it working - and it is much faster than the node solution I posted above.
But maybe the aprroach is not the best as it searches first for a song title... well, if this title is quite common I think You know what will happen... :

EDIT: I found out that You can add the artist to the songtitle query - so the query is nicely narrowed down
EDIT: also added a query for the song that is currently playing in mpd
So - for those who are interested:
-In the line at the beginning:
headers = {'Authorization': 'Bearer <<<YOUR Client Access Token>>>'}
You will have to enter Your Client Access Token
I left some prints so You can see what's happening
Code:
#!/usr/bin/python3
# -*- encoding: utf-8 -*-
import requests
from bs4 import BeautifulSoup
from os import popen
from time import sleep
base_url = "https://api.genius.com"
headers = {'Authorization': 'Bearer <<<YOUR Client Access Token>>>'}
artist_name = popen('mpc --format %artist% | head -n 1').read().replace("\n","")
song_title = popen('mpc --format %title% | head -n 1').read().replace("\n","")
def lyrics_from_song_api_path(song_api_path):
song_url = base_url + song_api_path
response = requests.get(song_url, headers=headers)
json = response.json()
path = json["response"]["song"]["path"]
#gotta go regular html scraping... come on Genius
page_url = "https://genius.com" + path
print(page_url)
print()
print()
res= None
i=0
while res == None:
if i > 5: break
page = requests.get(page_url)
html = BeautifulSoup(page.text, "html.parser")
#remove script tags that they put in the middle of the lyrics
# [h.extract() for h in html('script')]
#at least Genius is nice and has a tag called 'lyrics'!
res = html.find('div', class_='lyrics')
i+=1
sleep(1)
if res:
lyrics = res.get_text() #updated css where the lyrics are based in HTML
return lyrics
else:
return "Found lyrics page but unable to query the lyrics"
if __name__ == "__main__":
search_url = base_url + "/search"
data = {'q': song_title + ' ' + artist_name}
response = requests.get(search_url, params=data, headers=headers)
print(response.url)
json = response.json()
song_info = None
for hit in json["response"]["hits"]:
if hit["result"]["primary_artist"]["name"] == artist_name:
song_info = hit
break
else:
print(hit)
if song_info:
song_api_path = song_info["result"]["api_path"]
print(song_api_path)
print()
print(artist_name + " - " + song_title)
print (lyrics_from_song_api_path(song_api_path))
EDIT: while the query is well narrowed down, there was still a problem with requesting the final page containing the lyrics - dunno what's happening, maybe the page is not completely loaded - but, I created a loop which tries to receive the content in several attempts... seems to work atm