AI Media Production and Training | Music, Speech, Text, Code, Games, Image, Video, Web, Photo and much more.

Generative Artificial Intelligence - AI in Theory and Practice

Hello, my name is Johann Dirschl, managing director of DIRSCHL.com GmbH, specializing in AI, audiovisual media, and training. On nuonu.com, we test generative AI, i.e., models capable of creating new content. We distinguish the following generative AI areas:

Music & Speech

Suno, Frescobaldi, LilyPond, Udio, ElevenLabs

Text, Code

ChatGPT, Gemini, Meta.AI, Grok, DeepSeek, Mistral, Perplexity, Cursor

Image & Video

Midjourney, DALL·E, Firefly, Stable Diffusion, Runway Gen-2, Pika Labs, Flux

Web, Games

Websites, SEO, Accessibility, Plugins, WebApps, Pagespeed and much more.

Photo & Editing

Aftershoot, Topaz, Adobe Photoshop, Lightroom, Bridge

Learn about Artificial Intelligence through practical examples and live presentations

Generative Artificial Intelligence has practically arrived everywhere. My task is to test it, create my own workflows, and produce practical examples as well as final products.

Many know me as a programmer, lecturer, or photographer. I try things out, I'm a tech nerd, and I'm always looking for new, better alternatives to speed up work processes. Workflow is key, and in many areas it has changed so much, especially through AI, that the general public can now manage it even without specialized knowledge.

Every AI technology is initially hated by many because not everyone is ready to accept the new realities. Then comes a period in which use and rejection balance out, and shortly thereafter, everyone uses it. In the end, I expect AI's knowledge and automation to far surpass humans. In my view, the AGI moment, the point at which Artificial General Intelligence exists, has already arrived in many areas without anyone noticing.

Don't miss the moment, because your entire life and the way you will work depend on it.

Glad you're here.

Johann Dirschl, DIRSCHL.com GmbH

AI Music and Speech: Revolution through Artificial Intelligence

Artificial intelligence has made enormous progress in music and audio generation in recent years. AI-powered tools make it possible to generate high-quality pieces of music or realistic synthetic speech within seconds. Three of the most significant platforms in this area at present are Suno, Udio, and ElevenLabs.

Suno

Suno is an AI music platform that allows users to generate complete songs with lyrics, melody, and instrumentation. It is based on powerful language models and is particularly suitable for creative applications such as songwriting and sound design. Suno is characterized by:

Easy Operation: Music is generated through simple text input.
Complete Tracks: The AI generates not only instrumentals but also vocals.
Flexibility: Users can influence styles and genres.

Udio

Udio is another advanced AI platform for music production. In contrast to Suno, Udio places a special focus on professional sound quality and artistic freedom. Some of the main features are:

Precise Control: Users can work in more detail on arrangements and mix.
High-Quality Audio Output: Professional production without a studio setup.
Diverse Styles: From electronic music to orchestral pieces.

ElevenLabs

ElevenLabs is an AI-powered speech synthesis platform that can generate natural-sounding voices. It is particularly interesting for voice-overs in videos, podcasts, and interactive media. The most important features:

Realistic AI Voices: Voices sound human and emotional.
Multilingual Support: Ideal for international productions.
Individual Voices: Users can create their own voice profiles.

Conclusion and Examples

I mainly create music of all kinds, and as a musician, I can say that Suno, as of October 2025, is in my view the best AI platform for music. It recognizes protected works, allows free titles, and handles a wide range of languages, dialects, rhythms, and instrumentation. Individual fragments can now also be replaced, covers created, and personas defined to give tracks a recognizable style or singer. It also responds to BPM specifications and key signatures.

In my estimation, this means it works better than 90% of all musicians. Musicians can use Suno to complete their ideas or even hand over the entire composition to it.

I have been using Suno since V 2.5 and have already created many albums and tracks with it. Good practical examples, besides original compositions in German and English, are instrumentals, shorts, and Christmas songs. My idea was to recreate old Christmas songs that are already GEMA-free with the help of artificial intelligence. For comparison, I also generated my own Christmas songs, mostly with lyrics from ChatGPT that were based on my ideas and individually adapted. These "New Christmas Songs" are original compositions (key, tempo, instrumentation, mood, vocal specifications, and so on), so no lyrics or melodies were stolen. For Christmas markets and similar events, pure Christmas background music is often sufficient, which I also created myself or based on old GEMA-free Christmas music.

You don't make friends with everyone this way, but it's clear that AI is capable of creating music perfectly.

On the legal side, rights holders are trying to have these platforms regulated, since well-known pieces of music were also used for training. The counterargument is that musicians also learn from other musicians and that music is never reinvented from scratch. No matter how you look at it, we will not be able to avoid the topic of artificial intelligence in music, and many pieces in the charts are probably already created by musicians with the help of these tools.

Currently, I need about 3 hours for a song of my own, including my own lyrics, mastering, cover, and publication. My main goal is to recognize current changes in artificial intelligence early and constantly adapt my workflow. This also creates an audio pool that I can use without hesitation for photo shows, Christmas parties, and similar occasions. In addition to 140 handmade musical works that we offer with notation at https://www.vladimirsterzer.com, there are now also more than 150 tracks with over 5 hours of AI music available. These are a nice addition, as they are versatile and can be individually adapted to videos and other uses, or newly generated.

nuonu is the name of our old band, in which I played guitar, bass, and synthesizer and also did the recording. That's why I like to call myself a musician, even if the tools are different today. Out of habit, all my AI tracks still go through Logic. With the newer versions, the tracks hardly need any work anymore, although exporting stems (individual tracks) brings back the old, playful mixing and mastering. At a minimum, I load the WAV into Logic and raise the level to 0 dB. I no longer make manual changes to the music itself, but I do have the exact tempo determined, set fade in/out and volume, and bounce MP3 and WAV. Some tracks also received a cash register sound (Kakakakakaufen), and if the vocals are too quiet for me, I sometimes work with stems.
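The level adjustment and fades described above can be sketched in a few lines of Python. This is only an illustration, not what Logic does internally. It assumes that bringing the level up to zero means peak normalization to 0 dBFS, and it works on a plain list of float samples between -1.0 and 1.0. The function names, the test tone, and the fade length are hypothetical.

```python
import math


def normalize_peak(samples, target_dbfs=0.0):
    """Scale samples so the loudest one sits at target_dbfs (0.0 = full scale)."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silence stays silence
    target_amplitude = 10 ** (target_dbfs / 20)  # dBFS -> linear amplitude
    gain = target_amplitude / peak
    return [s * gain for s in samples]


def apply_fades(samples, fade_len):
    """Apply a linear fade-in and fade-out of fade_len samples each."""
    out = list(samples)
    n = len(out)
    fade_len = min(fade_len, n // 2)  # fades must not overlap
    for i in range(fade_len):
        factor = i / fade_len  # ramps from 0.0 towards 1.0
        out[i] *= factor
        out[n - 1 - i] *= factor
    return out


# Hypothetical input: one second of a quiet 440 Hz sine at 8 kHz sample rate.
rate = 8000
quiet = [0.25 * math.sin(2 * math.pi * 440 * t / rate) for t in range(rate)]

loud = normalize_peak(quiet)            # peak now at full scale
faded = apply_fades(loud, fade_len=800)  # 0.1 s fade in and out
```

A real workflow would read and write WAV files and probably use a smoother fade curve; the sketch only shows the arithmetic behind the two steps.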

However, the Music Production Workflow is still intact and only rarely needs to be adapted to new possibilities. A small page on Marketing AI Music shows possibilities that are less relevant for me personally. AI music should primarily be fun, save time, promote one's own creativity, help musicians with composition, and bring more royalty-free music to the market.

Generative AI for Text and Code

Artificial intelligence is revolutionizing not only music and audio but also the creation of texts and code. Modern AI models can generate texts, create summaries, write code, and even solve complex problems. Here are some of the most important AI models for this area:

ChatGPT

https://chatgpt.com/

ChatGPT by OpenAI is one of the best-known language models, used for both general texts and programming applications. Its main features are:

Conversational Ability: Ideal for dialogues, creative texts, and information retrieval.
Code Generation: Supports the programming of applications, websites, and plugins.

I have been a user from the very beginning and use it daily for texts, all kinds of questions, and programming. Especially for complex analyses, e.g., of web source code or SEO, it is still indispensable for me today. I often create the first basic version of a plugin with ChatGPT and then switch to Cursor for larger applications, since it can view and manage entire projects. Could I do without the Plus account? Yes, because Cursor alone would support my development work well. I also get a second opinion from other LLMs. For most applications, the free version of ChatGPT is sufficient.

Google Gemini

https://gemini.google.com/

Gemini by Google is a multimodal model that can process text, images, and even audio. It is characterized by:

Multimodal Capabilities: Processing different data types.
Optimized for Research: Helps with information retrieval.
Good Context Processing: Provides thoughtful answers with a logical structure.
Free API Usage for Developers: Allows the creation of your own programs with artificial intelligence.

It is currently more important to me than ChatGPT.

Grok

https://grok.com/

Grok by xAI is great when it comes to getting rather candid content. It uses X as an additional data source, so you are particularly close to current information and the opinions of companies and users. Image and code generation are also very good.

It was pretty good even in the first version and can still be used in a free variant today. It's definitely worth a look.

Claude

https://claude.ai/

Claude by Anthropic places particular emphasis on safety and ethical aspects in AI usage. Its most important features are:

High Text Quality: Focused on natural and consistent texts.
Safety-Oriented: Reduces risks of misinformation.
Large Context Window: Can capture and process long conversations.

It is my preferred AI in the code area; Sonnet is practically the standard in Cursor. Claude Sonnet is super fast and qualitatively very good.

DeepSeek

https://www.deepseek.com/

DeepSeek became known for running on different AI hardware and for being more efficient in processing speed and hardware costs. It was also the first model I installed locally on my computer. It also sparked discussions because content critical of China is handled differently in the online version than in locally installed environments.

Today I no longer use DeepSeek, but it shows that great AI developments are also taking place in China and virtually all major players like Baidu, Alibaba, etc., already operate their own LLMs.

Mistral

https://mistral.ai/

Mistral offers openly available models that are particularly suitable for programming applications. Their strengths are:

High Efficiency: Optimized for resource-saving calculations.
Specifically for Developers: Good code generation and completion.
Open-Source Approach: Freely available and customizable.

Perplexity

https://www.perplexity.ai/

Perplexity AI is an AI-powered research and information model specialized in the efficient provision of knowledge. It offers:

Fast Information Retrieval: Ideal for targeted research.
Compact Answers: Summarizes complex topics understandably.
Good Web Integration: Accesses current information.

Perplexity is my replacement for Wikipedia and for all knowledge questions. It provides detailed political contexts and is also a press substitute for me. Perplexity draws on countless sources and LLMs and delivers everything a user could wish for in seconds.

In fact, Perplexity can do almost everything, including generating code, helping with homework, and much more. What's particularly interesting is that the answers are current and sources are linked.

Generative AI for Image and Art

Artificial intelligence has also transformed the world of images and art. AI models can create realistic photos, abstract artworks, vector graphics, 3D renderings, logos, and much more. The technology is used in various areas, including:

Image Generation: Creation of images from text descriptions.
Photo Editing: AI-powered enhancements and adjustments.
Vector Graphics: Automatic creation and editing of vector images.
3D Modeling: Support for creating complex 3D objects.
Logo Design: Generation of unique logos based on specifications.
Idea Generation: Support for creative processes through AI-generated inspirations.
Image Analysis: Recognition and classification of content in images.

Midjourney

https://www.midjourney.com/

Midjourney is an AI platform that generates impressive, artistic images based on text input. It is characterized by:

High-Quality, Creative Images: Particularly suitable for concept art and design.
Easy Operation: Generates images via Discord commands.
Artistic Freedom: Strong emphasis on style diversity.

DALL·E

https://openai.com/de-DE/index/dall-e-3/

DALL·E by OpenAI is an AI for image generation that creates detailed and realistic images based on text descriptions. Key features:

High Detail Accuracy: Realistic and creative image generation.
Object Linking: Combines different elements logically in an image.
Image Editing: Allows variations and additions to existing images.

Currently, I no longer use DALL·E myself. Its functions are integrated into LLMs, and the results do not differ from Midjourney. But in the AI world, you should never write off big players, and the next update is sure to come.

Firefly

https://firefly.adobe.com/

Firefly by Adobe is an AI-powered platform for image editing and generation with a focus on creative control. Advantages:

Integration with Adobe Products: Perfect for Photoshop and Illustrator.
Non-Destructive Editing: AI-powered tools for creative adjustments.
Easy Text-to-Image Generation: Creates images from text prompts.

Firefly is indispensable because it is integrated into Adobe products. Video generation seems too expensive to me, but that can change quickly. Those who work with Premiere can extend videos, and Photoshop users can create images or do inpainting with generative content. It is very easy to use and delivers good quality.

Stable Diffusion

https://stability.ai/news/stable-diffusion-public-release

Stable Diffusion is an open-source model for image generation, particularly suitable for individual customizations. Features:

Fully Customizable: Runs locally and can be modified.
Complex Image Styles: Enables detailed and realistic graphics.
Open-Source Freedom: Free to use and expandable.

Flux

https://flux-ai.io/de/flux-ai-image-generator/

Flux is an innovative AI platform for image art, characterized by experimental spirit and creative algorithms. Its strengths are:

Discover New Styles: Generates unconventional and experimental images.
Strong Algorithms: Uses neural networks for artistic effects.
Creative Workflows: Promotes new approaches to digital art.

Generative AI Videos and 3D

The latest AI models enable not only the creation of static images but also realistic videos and 3D animations. They are used in various areas, including film production, visual effects, game design, and virtual reality. Application areas include:

AI-Generated Video Sequences: Creation of realistic or stylized videos from text descriptions.
Video Editing: Automated optimization of clips, color corrections, and effects.
3D Animation: Generation and control of complex 3D models and animations.
Scene Creation: Creation of complete environments for games, VR, or simulations.

Veo 3

Veo 3 is, in my view, currently the best video generator. It includes speech and music and produces highly realistic videos that will very likely shape the future film market. At least for individual parts such as special effects, it makes very cheap and professional production possible.

Runway Gen-2

Runway Gen-2 is a powerful AI tool for video generation and editing. Key features:

Text-to-Video Generation: Generates videos based on text instructions.
Image-to-Video Generation: Generates videos based on uploaded image data.
AI-Powered Editing: Tools for color correction, rotoscoping, and effects.
Easy Application: Intuitive user interface for creative projects.

Runway offers the possibility to create about 20 video sequences for free after registration. Afterwards, you can choose from various subscription models.

Adobe Firefly Video

Since mid-February 2025, the possibilities of image generation in Adobe Firefly have been supplemented by generative AI for video. The service works similarly to Runway and delivers videos based on uploaded images or a prompt. Firefly Video is a powerful AI tool for video generation. Key features:

Text-to-Video Generation: Generates videos based on text instructions.
Image-to-Video Generation: Generates videos based on uploaded image data.
AI-Powered Editing: Tools for color correction, rotoscoping, and effects.
Easy Application: Intuitive user interface for creative projects.

As an Adobe CC subscriber, I was able to create 2 videos before a payment request (a subscription for AI services) was displayed. For the first time, Adobe is attempting to monetize its AI services separately. According to the offer, this probably also includes other generative audio, video, and photo AIs from Adobe.

Sora

Sora by OpenAI is an advanced AI for realistic video generation. It offers:

Detailed Movements: Generates videos with complex physics and realistic movement.
Scene Creation: Creates environments that appear cinematic and immersive.
Automatic Adjustments: Optimizes light, shadows, and textures for better results.

As of mid-February 2025, Sora is not yet available in Germany. However, numerous videos already show how powerful the AI works.

Pika Labs

Pika Labs is an innovative platform for AI-powered video editing and animation. Its strengths are:

Automated Effects: Generation of visual effects from text descriptions.
Animation Control: Control of movement and dynamics of characters and objects.
Intuitive Operation: Simplifies the creative process through automation.

Flux

Flux is a versatile AI platform for artistic and experimental video projects as well as 3D design. Special features:

Artistic Freedoms: Generates unconventional video effects and animations.
3D Modeling: Creates visually impressive and detailed objects.
Innovative Algorithms: Uses advanced neural networks for dynamic effects.

Midjourney

As a Midjourney subscriber, I like to use the new ability to generate videos for demo purposes. The quality is not yet as good as with other providers, but it also doesn't cost extra. Midjourney is somewhat slower in its development than other AI providers. Therefore, I no longer recommend it, but I will let my annual subscription run out. I created all my music album covers and much more with it, and it does solid work. For websites and similar uses, however, I need more photorealistic images in high resolution.

Generative AI for Web, SEO, Plugins, and WebApps

Artificial intelligence is changing the way websites are developed, optimized, and managed. From automated content creation to SEO optimization, intelligent plugins, and web apps – AI offers enormous advantages to web developers and content creators.

WordPress and AI-Powered Themes

WordPress remains the world's most widely used Content Management System (CMS). With the increasing integration of AI into themes and plugins, web development becomes more efficient and creative. Particularly noteworthy is DIVI 5, which sets new standards through AI-powered design suggestions, automatic layout adjustments, and smart content analysis.

Automated SEO Optimization with AI

SEO remains a crucial factor for website visibility. AI-powered SEO plugins like RankMath automatically analyze content, suggest relevant keywords, and help improve on-page optimization in real-time. Google also uses AI algorithms like RankBrain to evaluate the relevance of search results.

AI-Powered Plugins and WebApps

In addition to SEO and design, there are numerous AI-powered plugins that optimize the workflow:

WordLift: Uses AI for semantic analysis and improves search engine ranking through structured data.
AI Chatbots: Plugins like Tidio AI or ChatGPT integrations enable intelligent customer interactions.
AI-Generated Content: Tools like ContentBot or Copymatic automatically create engaging blog posts and landing pages.
Image and Media Optimization: Plugins like Imagify AI or Adobe Firefly for Web automatically improve images.
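To make the term "structured data" from the list above more concrete, here is a minimal sketch of a JSON-LD snippet of the kind search engines read. It is not the output or API of WordLift or any other plugin named here. It only uses the public schema.org vocabulary (`Article`, `Person`, `headline`, `datePublished`), and the function names and example values are hypothetical.

```python
import json


def article_json_ld(headline, author_name, date_published, url):
    """Build a schema.org Article description as a JSON-LD string."""
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author_name},
        "datePublished": date_published,  # ISO 8601 date
        "url": url,
    }
    return json.dumps(data, indent=2, ensure_ascii=False)


def as_script_tag(json_ld):
    """Wrap the JSON-LD in the script tag that goes into the page's HTML."""
    return f'<script type="application/ld+json">\n{json_ld}\n</script>'


# Hypothetical example values, for illustration only.
json_ld = article_json_ld(
    headline="AI Music and Speech",
    author_name="Johann Dirschl",
    date_published="2025-10-01",
    url="https://example.com/ai-music",
)
snippet = as_script_tag(json_ld)
print(snippet)
```

In WordPress, a plugin would normally generate and inject such a block automatically; writing it by hand is mainly useful for understanding what the plugin produces.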

The Future of Web Development with AI

The future belongs to automation: AI can not only make code suggestions to web developers but even generate complete websites. WebApps benefit from personalized user experiences, automatic error detection, and optimized performance.

With the growing AI integration in WordPress, SEO, and WebApps, new opportunities arise to elevate the efficiency and quality of web development to a new level.

AI in Photography and Image Editing

Artificial intelligence has an enormous impact on photography and image editing. From intelligent functions in modern cameras to automated RAW processing with specialized programs – AI saves time and optimizes results.

AI in Modern Cameras

Many current cameras integrate AI-based technologies that support photographers:

Automatic Scene Recognition: Cameras analyze scenes in real-time and select optimal settings for portraits, landscapes, or action shots.
AI-Powered Autofocus Systems: Detection of faces, eyes, and even specific objects for razor-sharp images.
Noise Reduction and HDR Techniques: AI improves image quality already during capture.

AI in RAW Development and Culling

Post-processing photos often takes a lot of time. AI-powered software revolutionizes this process:

Aftershoot: Automates culling (pre-sorting of images), detects duplicate or blurry photos, and offers fast editing functions.
Adobe Lightroom: AI-powered presets, automatic image enhancements, and selective edits with a click.
Topaz Labs (Gigapixel, Sharpen, DeNoise): Extends image optimization possibilities with high-end noise reduction, sharpening, and upscaling.
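How software can flag a blurry photo during culling can be illustrated with a classic heuristic, the variance of the Laplacian: sharp images have strong edges and therefore a widely spread edge response, while blurry images do not. This is a generic textbook technique shown as a sketch, not the method used by Aftershoot or any other product listed above. The function names, the tiny test images, and the threshold are hypothetical.

```python
def laplacian_variance(gray):
    """Variance of a 4-neighbour Laplacian over a 2D grayscale image.

    gray is a list of rows with brightness values, e.g. 0..255.
    Border pixels are skipped because they lack a full neighbourhood.
    """
    height, width = len(gray), len(gray[0])
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            lap = (
                gray[y - 1][x] + gray[y + 1][x]
                + gray[y][x - 1] + gray[y][x + 1]
                - 4 * gray[y][x]
            )
            responses.append(lap)
    if not responses:
        return 0.0
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)


def is_blurry(gray, threshold):
    """Flag an image as blurry when its edge response varies too little."""
    return laplacian_variance(gray) < threshold


# Hypothetical 8x8 test images: a hard checkerboard and a flat gray area.
sharp = [[255 if (x + y) % 2 == 0 else 0 for x in range(8)] for y in range(8)]
soft = [[128 for _ in range(8)] for _ in range(8)]

print(is_blurry(sharp, threshold=100.0))  # checkerboard has strong edges
print(is_blurry(soft, threshold=100.0))   # flat image has no edges at all
```

A usable threshold depends on resolution and subject, so in practice it would have to be tuned on real photos; AI-based culling tools go well beyond this single measure.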

Advanced Image Editing with AI

In addition to RAW development and sorting, there are numerous other AI functions:

Adobe Photoshop: AI tools like Generative Fill, automatic subject selection, and content-aware retouching.
Luminar Neo: AI filters for sky replacement, skin enhancement, and scene optimization.
Neurapix: Automated color corrections based on individual editing styles.

The Future of AI in Photography

AI will continue to revolutionize photography in the following ways:

Intelligent Camera Functions: Will be developed further.
Culling and Editing Processes: Will become even more efficient.
New Creative Possibilities: Will open up through advanced AI-powered image manipulation.

With AI-based solutions, workflows can be optimized, allowing photographers more time for creativity and less for manual editing.

Development and Training Inquiries