
Category: Science and Technology

Interesting posts about science and technology.

Composing an AI Symphony: Weather Poems Transformed into Art and Sound

After my initial weather-driven AI poetry experiments, I wanted to take things to the next level and create more engaging videos. This time I used multiple AI tools to create the poetry, speech, music, and visuals. Every element, from words to images to sound, reflects AI’s creative capabilities in capturing the essence of weather and nature. The result is a ten-minute compilation of poems generated by ChatGPT interpreting weather data, with no text edits.

Visuals

To create the visuals, I built a JSON file containing the information saved from generating several weather poems. Then I wrote a Python script that looped through the file, asking ChatGPT to generate an image based on each poem’s text and its corresponding weather data.
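
Here’s a minimal sketch of that loop. It assumes the JSON file holds a list of records with poem and weather fields (the field names in my actual file differ), and it uses OpenAI’s image endpoint in place of the ChatGPT UI:

```python
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Hypothetical structure: a list of {"poem": ..., "weather": ...} records
with open("weather_poems.json") as f:
    poems = json.load(f)

for i, record in enumerate(poems):
    prompt = (
        f"Create an image inspired by this poem:\n{record['poem']}\n\n"
        f"Weather conditions: {record['weather']}"
    )
    result = client.images.generate(
        model="dall-e-3", prompt=prompt, n=1, size="1024x1024"
    )
    print(f"Image {i}: {result.data[0].url}")
```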

I then took each image and the audio from each poem to create animations using Kaiber.ai.

Voices

To add some texture to the vocals, I modified my original weather poem process to exclude background music, leaving only video and speech. I also added a random choice between four voice models from ElevenLabs. I then used another Python script to add time-stretched tracks to the vocals, giving them a more otherworldly feel.
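
In outline, the voice selection and time-stretching looked something like this. This is a sketch, not the production script: the voice IDs are placeholders, and I’m using librosa’s time_stretch as the stretching mechanism:

```python
import random
import numpy as np
import requests
import librosa
import soundfile as sf

# Placeholder IDs -- substitute real voices from your ElevenLabs account
VOICES = ["voice_id_1", "voice_id_2", "voice_id_3", "voice_id_4"]

def speak(text, api_key, out_path="speech.mp3"):
    """Synthesize speech with a randomly chosen ElevenLabs voice."""
    voice = random.choice(VOICES)
    resp = requests.post(
        f"https://api.elevenlabs.io/v1/text-to-speech/{voice}",
        headers={"xi-api-key": api_key},
        json={"text": text},
    )
    resp.raise_for_status()
    with open(out_path, "wb") as f:
        f.write(resp.content)
    return out_path

def add_stretched_layer(path, rate=0.5, out_path="speech_layered.wav"):
    """Mix a half-speed copy of the vocal under the original."""
    y, sr = librosa.load(path, sr=None)
    stretched = librosa.effects.time_stretch(y, rate=rate)
    n = min(len(y), len(stretched))
    mixed = y.copy()
    mixed[:n] += 0.4 * stretched[:n]        # quiet, ghostly underlay
    mixed /= max(1.0, np.abs(mixed).max())  # avoid clipping
    sf.write(out_path, mixed, sr)
```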

To be honest, my wife was not impressed with this decision: she had trouble making out the speech and recommended I at least create subtitles (which I did, but you may have to turn them on manually).

Music

All the background music was composed by AIVA.ai. I picked from among the thousand or so tracks and track sections I’ve generated over the course of a year, then put my selections together manually in Audacity.

Final Composition

Once I had my audio and visual elements, I put them together using the OpenShot video editor (in case you haven’t noticed, I like to use open source software whenever I can).

Conclusion

Overall, this was another really fun learning experience. I like what came out, flaws and all.
And there are definitely flaws.

The poems are interesting, but are clearly hallucinating at times. Granted, I gave ChatGPT permission to do so, so I shouldn’t be surprised.

I do wish I had more control over the animations, or, better yet, could automate the process, but I don’t know how to do that. Yet. The same goes for the additional processing of the vocals and music.

More importantly, I want my next project to have more of my poetic voice. This particular project has its merits and its own beauty – but it showcases more of my skills as a programmer than as a poet.

Actually…I’m going to take that back. I’m going to go out on a limb and posit that this is one of my voices, one that is an amalgamation of the programmer, the poet, the LLM, the external data, and the AI services.

I feel strongly that I must be in the loop when it comes to art that I claim is mine. Where I am in that loop, however, will depend on the art I am creating and why I’m creating it.

That won’t sit well with many creators or consumers of art.

To those folks I say this: So be it – I’ll see you in the marketplace of ideas.


A Real Poet Writing a Poem with AI

This is a somewhat real-time deep dive into one of the ways I use generative AI (in this case ChatGPT4) to develop a poem, along with my views on doing so.

Apologies in advance for the clickbait-y title.

After searching YouTube for videos on the topic, I wound up being more than a little disappointed. All I found were either rants about “AI will ruin poetry” or demos of “look at how cool this is with only a prompt or two”.

I can do better than that, even with an admittedly less-than-perfect presentation.

Be warned, this is a long one, folks, so buckle up.


NaPoGenMo 2024: Final Notes

So…very late, but I got something out there. I made heavy use of Microsoft Azure components like Logic Apps, Azure Functions, and Storage Accounts. And I got to use a lot of Python.

In the end, I put together something that generates – or attempts to generate – a poem three times a day, based on the weather from a random US city and state. The GitHub repo contains only the Python code, along with a couple of unfinished sections.
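
Stripped of all the Azure plumbing (Logic Apps for the schedule, Functions for the steps), the core of the pipeline can be sketched like this. The weather API and the prompt here are stand-ins, not the repo’s actual code:

```python
import random
import requests
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

# A small sample of cities; the real project draws from a much larger list
CITIES = [("Atlanta", "GA"), ("Boise", "ID"), ("Portland", "OR")]

def generate_weather_poem(weather_api_key):
    city, state = random.choice(CITIES)
    # Stand-in weather source (OpenWeatherMap); any weather API would do
    report = requests.get(
        "https://api.openweathermap.org/data/2.5/weather",
        params={"q": f"{city},{state},US", "appid": weather_api_key,
                "units": "imperial"},
    ).json()
    prompt = (
        f"Write a short poem inspired by the current weather in "
        f"{city}, {state}: {report['weather'][0]['description']}, "
        f"{report['main']['temp']} degrees Fahrenheit."
    )
    reply = client.chat.completions.create(
        model="gpt-4", messages=[{"role": "user", "content": prompt}]
    )
    return reply.choices[0].message.content
```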

There’s more that I would like to do, like automatically upload to YouTube, but for now I’m going to call this “good enough”.

My wife did offer some possible uses of the code as it stands: I could generate poems for all the cities in one state and turn that into an audiobook, or take one city and create a month’s worth of poems. I might just try that, since it takes little effort.

Anyway, I hope someone finds this interesting/useful:
https://github.com/bohara2000/NaPoGenMo2024-PoeticDigestiveSystem


Overthinking Much?

I’ve spent the past couple of days just writing out plans for how to put together my project for National Poetry Generation Month. And to be honest, I’m getting to the point where I feel like I’m overthinking things.

I’ve written and doodled so much that I think if I don’t actually just start putting code to IDE, I’m never going to get this thing done.

So let’s just get started.

NaPoGenMo 2024 API design notes - pt 1
Possible ways to separate the functions for the project

NaPoGenMo 2024 API design notes - pt 2
What do I want the LLM to do? Regardless, this project requires a human to remain in the loop.

NaPoGenMo 2024 API design notes - pt 3
Honing the flow of data. I want to take cues from the Oulipo movement and computer poetry research.


Poem: The Garden of the Patrons

Took me a while, but I’ve created a little video of my poem, “The Garden of the Patrons,” which was published in Pandemic Atlanta 2020 magazine, an assortment of artwork, literature, poetry, and photography documenting the experiences of Atlanta-based artists during the COVID-19 pandemic.

Hope you enjoy.


Poetic Tech at the Decatur Book Festival: The “Bloop-Bleep” Stage

On Sunday, August 31, 2014, I did a presentation on the intersection of technology and poetry at art|DBF, an art-oriented segment of the Decatur Book Festival. The presentation was the culmination of several months of coding to develop a system that allowed a poet and an audience to create an interactive soundscape.


Why did I do this?

Most people, when they think of poetry, think of it as a fundamentally human, often life-affirming activity.

Most people, when they think of technology, think of it as an inhumane, if not inhuman, often soul-crushing process.

This is a false dichotomy, of course. Poetry and technology are both artifacts of what humans do. They are both profoundly human acts.

From the campfire to the cathedral, from the crystal AM radio to the liquid crystal display, our technology has affected what form poetry takes, who creates it, who listens to it, where it is experienced, and how it is distributed.

My intent was to build a demonstration of one possible way to enhance the experience of poetry for both poet and audience.


How did it work?

I built a web-based audio application that let smartphones control sounds.

The phones accessed a web server running on my laptop. From the audience pages, listeners could read through the poems being performed and manipulate sounds using one of three instruments.

Audience UI

The audience accesses a website via smartphone. The site offers a view of the current poem, a dropdown selection of poems, and links to one of three musical interfaces, the first of which displays by default.

The three interfaces do the following things:

  • Instrument 1: creates a rain stick-like sound with different effects based on moving a point within a small window;
  • Instrument 2: a set of four percussion pads;
  • Instrument 3: a text area that creates sounds for each word typed.

Poet UI

The interface for the poet has six options – unfortunately only four worked at the time of performance, and only three worked without issues.

  • Effect 1: pitch follower creating an audio effect a fifth higher than the detected frequency
  • Effect 2: pitch follower creating an audio effect a seventh higher than the detected frequency
  • Effect 3: Multicomb filter
  • Effect 4: Spectacle filter
  • Effect 5: Hypnodrone – a drone effect kicked off by detected amplitude
  • Effect 6: Stutter – a warbling bass line using a sine oscillator, originally intended to create a glitch effect


The speaker had a separate interface for adding vocal effects and a background beat.

The web pages sent messages to a set of ChucK scripts running on my laptop. The scripts generated the sounds, altered the vocals, and recorded the presentation.
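
I won’t reproduce the whole setup here, but since ChucK listens for OSC messages natively, a plausible bridge between the web app and the scripts (a sketch, not necessarily my exact wiring) looks like this:

```python
# pip install python-osc
from pythonosc.udp_client import SimpleUDPClient

# ChucK's OscRecv commonly listens on port 6449
chuck = SimpleUDPClient("127.0.0.1", 6449)

def on_pad_hit(pad_index: int):
    """Forward a percussion-pad tap from the web page to ChucK."""
    chuck.send_message("/instrument/pad", pad_index)

def on_word_typed(word: str):
    """Forward a typed word from Instrument 3 to ChucK."""
    chuck.send_message("/instrument/word", word)
```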


How did it go?

The presentation itself was well-received. It was in the tent for Eyedrum, an Atlanta-based, non-profit organization developing contemporary art, music and new media in its gallery space.

I did my presentation outside with a set of powered PC speakers attached to the laptop. Later, I borrowed a PA and mixer from my friend and fellow poet Kevin Sipp. By the way, check out his debut graphic novel, The Amazing Adventures of David Walker Blackstone.


The laptop was attached to a wireless router that passersby could use to connect to the website. Everyone was able to connect and interact with the site. There were some glitches – which I’ll talk about later – but for the most part, people seemed intrigued by the possible uses of mobile and web technology for poetic performances.

A couple of components either did not perform as expected or did not work at all. Of the audience-specific pages, Instrument 3 did not play or was at too low a volume to be heard over the ambient sounds of the festival. There were also some issues with switching between poems.

The poet-specific pages had issues with two of the six effects, “Multicomb” and “Spectacle”: the multicomb filter had a feedback problem and was too loud, and the spectacle effect didn’t work at all. In addition, the audio developed latency issues. The recording of the first twenty minutes of the presentation suffered from unintended glitching and was pretty much ruined; the recording of the last fifteen minutes was a little better (I stopped the recording to switch to Kevin’s PA setup), but ran into the same issue not long in.


Conclusion

Overall, I think the presentation was well-received, and people were intrigued by what they heard. The issues with the setup became clear when I reviewed the recordings. There’s definitely room for improvement, and I will definitely build upon this design for future performances.

So good, bad, or ugly, I’m posting both recordings (Part 1 and Part 2) and the code for all to see.

Despite the issues, I consider the project a success. This is a prototype, so I expected some problems. Luckily, none of the problems were catastrophic. There were lots of bloops and bleeps, but nothing went “boom”. It would only have been a failure if I had learned nothing from the experience.

Until next time: check out the code, play with it, and let me know if you use or modify it.



Enhancing Poetry With Pitch-Following Effects And Sounds, Part 2: Interesting Mistakes

I’ve put together a proof of concept for enhancing poetry with ChucK scripts. However, I soon realized that I wasn’t actually doing pitch-following; instead, the code I put together was something called an “envelope follower”. I’ve uploaded the code to GitHub in case anyone wants to play around with it (you’ll need ChucK and the Audicle or miniAudicle IDE).

My physics is really rusty, so the best way I can explain it is that instead of checking the pitch of the voice to determine whether to kick off an effect, the script checks the *power* of the voice. I interpret this as more of a measurement of inflection or stress.

Not exactly what I’d planned, but it’s in the right direction.
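
In signal-processing terms, an envelope follower rectifies the signal and smooths it with a low-pass filter, producing a running estimate of loudness rather than pitch. Here’s the same idea in a few lines of Python (ChucK gets you there by chaining a FullRect into a OnePole):

```python
import numpy as np

def envelope(signal, pole=0.995):
    """Full-wave rectify, then smooth with a one-pole low-pass filter."""
    env = np.empty_like(signal)
    level = 0.0
    for i, x in enumerate(np.abs(signal)):
        # Higher pole values track the signal more slowly and smoothly
        level = pole * level + (1.0 - pole) * x
        env[i] = level
    return env
```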

This first draft of the script taught me a few things about how to build ChucK scripts that respond to vocal input. For starters, I now have a new dimension of the vocals that I can use to kick off effects. Currently, the threshold that determines when the effects start has to be adjusted manually, but it could be changed dynamically based on other criteria, like external data feeds or input from other people.

I also found that I needed a means to stop effects as well as start them. When I first put the code together without a way to stop an effect, the result got noisier and louder until I manually killed the program.
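
One simple way to get both behaviors is a gate with two thresholds (hysteresis): an effect starts when the envelope rises above one level and only stops after it falls below a lower one. A sketch, building on the envelope function above:

```python
def gate_events(env, on_threshold=0.1, off_threshold=0.05):
    """Yield ("start", i) / ("stop", i) events from an envelope signal."""
    active = False
    for i, level in enumerate(env):
        if not active and level > on_threshold:
            active = True
            yield ("start", i)
        elif active and level < off_threshold:
            active = False
            yield ("stop", i)
```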

I also wanted to vary the duration of the effects, so I did the following: (1) I included a global class for setting tempo and note durations; then (2) I added an array of time durations and looped through them each time an effect got kicked off.
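
In Python terms (the real versions are ChucK classes), those two pieces look roughly like this:

```python
import itertools

class BPM:
    """Global tempo: named note durations, in seconds."""
    def __init__(self, bpm=120):
        self.quarter = 60.0 / bpm
        self.whole = self.quarter * 4
        self.half = self.quarter * 2
        self.eighth = self.quarter / 2
        self.sixteenth = self.quarter / 4

bpm = BPM(96)

# Each triggered effect takes the next duration in the cycle
durations = itertools.cycle([bpm.quarter, bpm.eighth, bpm.eighth, bpm.half])

def effect_duration():
    return next(durations)
```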

Most of the resulting code is cobbled together from existing code examples found on the internet. My coding philosophy for the most part is based on what I call “the thieving magpie”: find components that do what I want (or close to it), slap them together, then modify as needed until I get the desired result.

The poem I used for the demo is “The Seekim”, by Sidney H. Sime. It comes from the book “Bogey Beasts”, which is out of print and hard to find. Each poem was written and illustrated by Sime, and each also had a musical score written by Joseph Holbrooke. I’ve never heard the music performed, but the book fascinated me. I’m still kicking myself for having sold it at a used book store almost twenty years ago.

Listen to the demo: https://api.soundcloud.com/tracks/157563219

So even though I didn’t exactly know what I was coding, I got some results I liked, and learned enough to start thinking about next steps.


The Fungi from Yuggoth Project – Programmatic (and Problematic) Composition

I had never heard the phrase “You get what you get and you don’t get upset” until I was listening to a lecture on poetry on CDs with my son Jack. It made me think about this project of mine – creating an audiobook of H.P. Lovecraft’s Fungi from Yuggoth. Not only was I recording my spoken version of it, but I was adding original soundtracks. And to put the cherry on this Geek Sundae, I was going to write code that would “render” the music for me.

The task was – and still is – daunting, and I’m uneasy about how it’s coming out. I can tell right now that more than half of this project will prove very difficult for a lot of people to listen to.

But you know what? To hell with it – this is fun for me…

My criterion for success is pretty simple: the project will be complete when all 36 poems are posted on Bandcamp.

The project consists of two versions of each poem – “compressed” and “uncompressed”. More on that later…

The music is created as the poem is typed. Each key pressed creates a note with a duration. Vowel keys and the space bar kick off samples or percussion instruments.
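
Conceptually, the mapping works something like this, sketched here in Python (the actual rules live in the ChucK scripts described below, and the note formula is just an illustration):

```python
def event_for_key(ch):
    """Map a typed character to a sound event (an illustrative rule set)."""
    if ch == " ":
        return ("percussion", "kick")    # space bar fires a drum
    if ch.lower() in "aeiou":
        return ("sample", ch.lower())    # vowels kick off samples
    # Everything else becomes a pitched note derived from the character code
    midi = 48 + (ord(ch) % 24)
    freq = 440.0 * 2 ** ((midi - 69) / 12.0)
    return ("note", round(freq, 1))

# "Render" a line of poetry as a stream of sound events
for ch in "Night-Gaunts":
    print(ch, event_for_key(ch))
```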

I’m using a programming language called ChucK for creating the music. I discovered the language while browsing for online classes at Coursera. The site had a class called “Introduction to Programming for Musicians and Digital Artists”. If you’re interested in using programming to create music, I recommend this course – it’s well-organized and you learn something regardless of whether you start as a coder or a musician.

To use this language, you’ll need to install ChucK and its development environment, miniAudicle. You can get them both from the ChucK website. I’m not going to get into the installation process – the site has a page devoted to that.

I use five scripts to create the music:

  • initialize.ck – this calls the master script, score.ck
  • score.ck – this calls three scripts needed to create and record the music
  • BPM.ck – this program defines Beats Per Minute (BPM) as well as named note durations (from whole note to 32nd note)
  • mechanical-typist.ck – this script is the heart of the music “rendering” system. It defines the rules and the instruments used. It also listens for the keyboard input that plays the instruments and effects.
  • rec-auto-stereo.ck – this is the recording script. It records until you shut off all the “Shreds” or pieces of code running in ChucK.

There is also a folder called “audio” containing all the audio samples used by the scripts.

Each of these scripts was based on either the examples used in the Coursera class or examples on the ChucK website.

I’m making the files I used to create the music available as a zip file on my Google Drive, so feel free to play with them and create your own pieces.

Here’s an example of what a “rendered” composition sounds like:

Listen: https://soundcloud.com/bryant-ohara/sonnet-xi-the-well-test

I’ve taken these initial renderings and done additional processing in Audacity.

Here are some examples of a “compressed” and “uncompressed” version of the poem, “Night-Gaunts”:

Listen: https://soundcloud.com/bryant-ohara/the-fungi-from-yuggoth-sonnet

Listen: https://soundcloud.com/bryant-ohara/the-fungi-from-yuggoth

You may have noticed that the uncompressed version is significantly longer than the compressed version. I was initially at a loss for how best to present the poems. I didn’t want to use the rendered music solely as raw material – the rendering is the actual text of the poem, just transformed into sound. Each rendering is a tone poem in a very literal sense.

That still doesn’t make it any easier to listen to, which is why I’m adding a heavily processed version of the vocal track to the uncompressed pieces. As the project progresses, I’ll think about what else, if anything, to add.

Stay tuned for more updates on the project!
