Τεχνικές αυτόματης μίξης κατά την ηχογράφηση και αναπαραγωγή χωρικά εκτεταμένης ηχητικής σκηνής.

Ρουμελιώτης, Σπυρίδων; Roumeliotis, Spyridon

Automatic mixing techniques for capturing and reproduction of a spatially extended sound scene.

Dublin Core metadata

dc.creator	Ρουμελιώτης, Σπυρίδων	el
dc.creator	Roumeliotis, Spyridon	en
dc.date.accessioned	2023-07-20T07:00:40Z
dc.date.available	2023-07-20T07:00:40Z
dc.date.issued	2023-07-20
dc.identifier.uri	http://hdl.handle.net/20.500.12688/10677
dc.description.abstract	Κατά την ηχητική κάλυψη ή ηχογράφηση εκτεταμένης χωρικής περιοχής, ενδείκνυται η χρήση πολλαπλών μικροφώνων τα οποία είναι κατανεμημένα στο χώρο. Η μίξη όμως και επεξεργασία των ηχητικών καταγραφών για τη δημιουργία του τελικού ηχητικού υλικού αποτελεί σημαντική πρόκληση, καθότι πρέπει να αντιμετωπιστούν προβλήματα όπως ο θόρυβος και η αντήχηση, οι φασματικοί χρωματισμοί και οι ανεπιθύμητες διακυμάνσεις της ηχητικής έντασης. Στην εργασία αυτή υλοποιήθηκαν τεχνικές πολυκάναλης επεξεργασίας σήματος με στόχο την αυτοματοποίηση της μίξης για την παραγωγή του τελικού ηχητικού υλικού με όσο το δυνατόν λιγότερη παρέμβαση από τον χρήστη. Οι τεχνικές υλοποιήθηκαν σε συνθήκες προσομοίωσης με χρήση σήματος φωνής ενώ για την αξιολόγησή τους αξιοποιήθηκε μετρική βασισμένη στο κριτήριο του Perceptual Evaluation of Speech Quality (PESQ). Μέσα από ένα σύνολο από πρώτα πειράματα, αναδεικνύεται καταρχάς η εγκυρότητα του PESQ ως ένα αντικειμενικό κριτήριο για την ποσοτικοποίηση της υποβάθμισης της ποιότητας του σήματος φωνής σε ένα μεγάλο εύρος από συνθήκες αντήχησης και θορύβου. Ένα δεύτερο σύνολο πειραμάτων υλοποιήθηκε με σκοπό τη σύγκριση διαφορετικών τεχνικών ηχητικής μίξης σε ένα περιβάλλον που προσομοιώνει θεατρική παράσταση. Τα αποτελέσματα αναδεικνύουν την υπεροχή της γνωστής τεχνικής delay and sum έναντι άλλων τεχνικών που βασίζονται σε αξιοποίηση ενός υποσυνόλου των διαθέσιμων μικροφώνων. Επιπλέον τεχνικές που αναδεικνύονται - όχι μόνο λόγω της επίδοσής τους αλλά και λόγω της εύκολης υλοποίησής τους - είναι μια προσέγγιση που αξιοποιεί ζεύγος μικροφώνων βασισμένο στη μέτρηση του Magnitude Squared Coherence (MSC) αλλά και η επιλογή του κοντινότερου στην πηγή μικροφώνου.	el
dc.description.abstract	When capturing a spatially extended sound scene, it is advisable to use multiple microphones distributed in space. However, mixing and processing the multiple audio channels is challenging because of problems such as noise, reverberation, spectral coloration and unwanted power variations. In this thesis, multichannel signal processing techniques are examined as a means of automating the mixing process, so as to produce an acoustic representation of the sound scene with as little user intervention as possible. The techniques are implemented in a simulated environment using dry speech signals as input, and overall performance is evaluated with the well-known Perceptual Evaluation of Speech Quality (PESQ) metric. A first series of experiments verifies the appropriateness of PESQ for measuring quality degradation in the audio signal across a wide range of noise and reverberation conditions. In a second round of experiments, the different mixing techniques are put to the test in a scenario that simulates a theatrical performance. The results indicate that the well-known delay-and-sum technique, which exploits all the available audio channels, achieves the best performance in terms of PESQ. However, two additional approaches provide competitive performance and are attractive because they are easy to implement: an approach that uses a pair of microphones, based on a measure of the magnitude squared coherence (MSC), and a much simpler approach that activates only a single microphone, the one closest to the sound source.	en
dc.language	Ελληνικά	el
dc.language	Greek	en
dc.publisher	ΕΛ.ΜΕ.ΠΑ., Σχολή Μουσικής και Οπτοακουστικών Τεχνολογιών (ΣΜΟΤ), ΠΜΣ Τεχνολογίες Ήχου και Μουσικής	el
dc.publisher	H.M.U., School of Music and Optoacoustic Technologies (SMOT), MSc in Sound and Music Technologies	en
dc.rights	Attribution-NonCommercial-NoDerivs 3.0 United States	*
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/3.0/us/	*
dc.title	Τεχνικές αυτόματης μίξης κατά την ηχογράφηση και αναπαραγωγή χωρικά εκτεταμένης ηχητικής σκηνής.	el
dc.title	Automatic mixing techniques for capturing and reproduction of a spatially extended sound scene.	en

healMeta metadata

heal.creatorName	Ρουμελιώτης, Σπυρίδων	el
heal.creatorName	Roumeliotis, Spyridon	en
heal.publicationDate	2023-07-20
heal.identifier.primary	http://hdl.handle.net/20.500.12688/10677
heal.abstract	Κατά την ηχητική κάλυψη ή ηχογράφηση εκτεταμένης χωρικής περιοχής, ενδείκνυται η χρήση πολλαπλών μικροφώνων τα οποία είναι κατανεμημένα στο χώρο. Η μίξη όμως και επεξεργασία των ηχητικών καταγραφών για τη δημιουργία του τελικού ηχητικού υλικού αποτελεί σημαντική πρόκληση, καθότι πρέπει να αντιμετωπιστούν προβλήματα όπως ο θόρυβος και η αντήχηση, οι φασματικοί χρωματισμοί και οι ανεπιθύμητες διακυμάνσεις της ηχητικής έντασης. Στην εργασία αυτή υλοποιήθηκαν τεχνικές πολυκάναλης επεξεργασίας σήματος με στόχο την αυτοματοποίηση της μίξης για την παραγωγή του τελικού ηχητικού υλικού με όσο το δυνατόν λιγότερη παρέμβαση από τον χρήστη. Οι τεχνικές υλοποιήθηκαν σε συνθήκες προσομοίωσης με χρήση σήματος φωνής ενώ για την αξιολόγησή τους αξιοποιήθηκε μετρική βασισμένη στο κριτήριο του Perceptual Evaluation of Speech Quality (PESQ). Μέσα από ένα σύνολο από πρώτα πειράματα, αναδεικνύεται καταρχάς η εγκυρότητα του PESQ ως ένα αντικειμενικό κριτήριο για την ποσοτικοποίηση της υποβάθμισης της ποιότητας του σήματος φωνής σε ένα μεγάλο εύρος από συνθήκες αντήχησης και θορύβου. Ένα δεύτερο σύνολο πειραμάτων υλοποιήθηκε με σκοπό τη σύγκριση διαφορετικών τεχνικών ηχητικής μίξης σε ένα περιβάλλον που προσομοιώνει θεατρική παράσταση. Τα αποτελέσματα αναδεικνύουν την υπεροχή της γνωστής τεχνικής delay and sum έναντι άλλων τεχνικών που βασίζονται σε αξιοποίηση ενός υποσυνόλου των διαθέσιμων μικροφώνων. Επιπλέον τεχνικές που αναδεικνύονται - όχι μόνο λόγω της επίδοσής τους αλλά και λόγω της εύκολης υλοποίησής τους - είναι μια προσέγγιση που αξιοποιεί ζεύγος μικροφώνων βασισμένο στη μέτρηση του Magnitude Squared Coherence (MSC) αλλά και η επιλογή του κοντινότερου στην πηγή μικροφώνου.	el
heal.abstract	When capturing a spatially extended sound scene, it is advisable to use multiple microphones distributed in space. However, mixing and processing the multiple audio channels is challenging because of problems such as noise, reverberation, spectral coloration and unwanted power variations. In this thesis, multichannel signal processing techniques are examined as a means of automating the mixing process, so as to produce an acoustic representation of the sound scene with as little user intervention as possible. The techniques are implemented in a simulated environment using dry speech signals as input, and overall performance is evaluated with the well-known Perceptual Evaluation of Speech Quality (PESQ) metric. A first series of experiments verifies the appropriateness of PESQ for measuring quality degradation in the audio signal across a wide range of noise and reverberation conditions. In a second round of experiments, the different mixing techniques are put to the test in a scenario that simulates a theatrical performance. The results indicate that the well-known delay-and-sum technique, which exploits all the available audio channels, achieves the best performance in terms of PESQ. However, two additional approaches provide competitive performance and are attractive because they are easy to implement: an approach that uses a pair of microphones, based on a measure of the magnitude squared coherence (MSC), and a much simpler approach that activates only a single microphone, the one closest to the sound source.	en
heal.language	Ελληνικά	el
heal.language	Greek	en
heal.academicPublisher	ΕΛ.ΜΕ.ΠΑ., Σχολή Μουσικής και Οπτοακουστικών Τεχνολογιών (ΣΜΟΤ), ΠΜΣ Τεχνολογίες Ήχου και Μουσικής	el
heal.academicPublisher	H.M.U., School of Music and Optoacoustic Technologies (SMOT), MSc in Sound and Music Technologies	en
heal.title	Τεχνικές αυτόματης μίξης κατά την ηχογράφηση και αναπαραγωγή χωρικά εκτεταμένης ηχητικής σκηνής.	el
heal.title	Automatic mixing techniques for capturing and reproduction of a spatially extended sound scene.	en
heal.type	Μεταπτυχιακή Διατριβή	el
heal.type	Master thesis	en
heal.keyword	αυτόματη μίξη, ηχογράφηση	el
heal.keyword	automatic mixing, sound recording	en
heal.access	free	el
heal.advisorName	Στεφανάκης, Νικόλαος	el
heal.advisorName	Stefanakis, Nikolaos	en
heal.academicPublisherID	ΕΛ.ΜΕ.ΠΑ. Ελληνικό Μεσογειακό Πανεπιστήμιο	el
heal.academicPublisherID	H.M.U. Hellenic Mediterranean University	en
heal.fullTextAvailability	true	el
tcd.distinguished	false	el
tcd.survey	false	el

Files in this item

Name: RoumeliotisSpyridon2023.pdf
Size: 7.826 MB
Type: PDF

Name: license_rdf
Size: 1.203 KB
Type: application/rdf+xml

This item appears in the following collections

Μεταπτυχιακές εργασίες / Master Theses [397]

Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 United States