
A Stealthy Harvard Startup Wants To Reverse Aging in Dogs, and Humans Could Be Next

1 Share
The idea is simple, if you ask biologist George Church. He wants to live to 130 in the body of a 22-year-old. From a report: The world's most influential synthetic biologist is behind a new company that plans to rejuvenate dogs using gene therapy. If it works, he plans to try the same approach in people, and he might be one of the first volunteers. The stealth startup Rejuvenate Bio, cofounded by George Church of Harvard Medical School, thinks dogs aren't just man's best friend but also the best way to bring age-defeating treatments to market. The company, which has carried out preliminary tests on beagles, claims it will make animals "younger" by adding new DNA instructions to their bodies. Its age-reversal plans build on tantalizing clues seen in simple organisms like worms and flies. Tweaking their genes can increase their life spans by double or better. Other research has shown that giving old mice blood transfusions from young ones can restore some biomarkers to youthful levels. "We have already done a bunch of trials in mice and we are doing some in dogs, and then we'll move on to humans," Church told the podcaster Rob Reid earlier this year. The company's efforts to keep its activities out of the press make it unclear how many dogs it has treated so far. In a document provided by a West Coast veterinarian, dated last June, Rejuvenate said its gene therapy had been tested on four beagles with Tufts Veterinary School in Boston. It is unclear whether wider tests are under way. However, from public documents, a patent application filed by Harvard, interviews with investors and dog breeders, and public comments made by the founders, MIT Technology Review assembled a portrait of a life-extension startup pursuing a longevity long shot through the $72-billion-a-year US pet industry. "Dogs are a market in and of themselves," Church said during an event in Boston last week. "It's not just a big organism close to humans. It's something people will pay for, and the FDA process is much faster. We'll do dog trials, and that'll be a product, and that'll pay for scaling up in human trials."

Read more of this story at Slashdot.

Read the whole story
PhaChayFy
9 days ago
reply
Australia
Share this story
Delete

Chromebooks are ready for your next coding project

1 Comment

This year we’re making it possible for you to code on Chromebooks. Whether it’s building an app or writing a quick script, Chromebooks will be ready for your next coding project.

Last year we announced a new generation of Chromebooks that were designed to work with your favorite apps from the Google Play store, helping to bring accessible computing to millions of people around the world. But it’s not just about access to technology, it’s also about access to the tools that create it. And that’s why we’re equipping developers with more tools on Chromebooks.


Support for Linux will enable you to create, test and run Android and web apps for phones, tablets and laptops, all on one Chromebook. Run popular editors, code in your favorite language and launch projects to Google Cloud from the command line. Everything works directly on a Chromebook.

Linux runs inside a virtual machine that was designed from scratch for Chromebooks. That means it starts in seconds and integrates completely with Chromebook features. Linux apps can start with a click of an icon, windows can be moved around, and files can be opened directly from apps.

A preview of the new tool will be released on Google Pixelbook soon. Remember to tune in to Google I/O to learn more about Linux on Chromebooks, as well as more exciting announcements.

PhaChayFy
9 days ago
reply
When?!?
Australia

Google Duplex: An AI System for Accomplishing Real-World Tasks Over the Phone

1 Share


A long-standing goal of human-computer interaction has been to enable people to have a natural conversation with computers, as they would with each other. In recent years, we have witnessed a revolution in the ability of computers to understand and to generate natural speech, especially with the application of deep neural networks (e.g., Google voice search, WaveNet). Still, even with today’s state of the art systems, it is often frustrating having to talk to stilted computerized voices that don't understand natural language. In particular, automated phone systems are still struggling to recognize simple words and commands. They don’t engage in a conversation flow and force the caller to adjust to the system instead of the system adjusting to the caller.

Today we announce Google Duplex, a new technology for conducting natural conversations to carry out “real world” tasks over the phone. The technology is directed towards completing specific tasks, such as scheduling certain types of appointments. For such tasks, the system makes the conversational experience as natural as possible, allowing people to speak normally, like they would to another person, without having to adapt to a machine.

One of the key research insights was to constrain Duplex to closed domains, which are narrow enough to explore extensively. Duplex can only carry out natural conversations after being deeply trained in such domains. It cannot carry out general conversations.

Here are examples of Duplex making phone calls (using different voices):
Duplex scheduling a hair salon appointment:
Duplex calling a restaurant:

While sounding natural, these and other examples are conversations between a fully automatic computer system and real businesses.

The Google Duplex technology is built to sound natural, to make the conversation experience comfortable. It’s important to us that users and businesses have a good experience with this service, and transparency is a key part of that. We want to be clear about the intent of the call so businesses understand the context. We’ll be experimenting with the right approach over the coming months.

Conducting Natural Conversations
There are several challenges in conducting natural conversations: natural language is hard to understand, natural behavior is tricky to model, latency expectations require fast processing, and generating natural sounding speech, with the appropriate intonations, is difficult.

When people talk to each other, they use more complex sentences than when talking to computers. They often correct themselves mid-sentence, are more verbose than necessary, or omit words and rely on context instead; they also express a wide range of intents, sometimes in the same sentence, e.g., “So umm Tuesday through Thursday we are open 11 to 2, and then reopen 4 to 9, and then Friday, Saturday, Sunday we... or Friday, Saturday we're open 11 to 9 and then Sunday we're open 1 to 9.”
Example of complex statement:

In natural spontaneous speech people talk faster and less clearly than they do when they speak to a machine, so speech recognition is harder and we see higher word error rates. The problem is aggravated during phone calls, which often have loud background noises and sound quality issues.

In longer conversations, the same sentence can have very different meanings depending on context. For example, when booking reservations “Ok for 4” can mean the time of the reservation or the number of people. Often the relevant context might be several sentences back, a problem that gets compounded by the increased word error rate in phone calls.

Deciding what to say is a function of both the task and the state of the conversation. In addition, there are some common practices in natural conversations — implicit protocols that include elaborations (“for next Friday” “for when?” “for Friday next week, the 18th.”), syncs (“can you hear me?”), interruptions (“the number is 212-” “sorry can you start over?”), and pauses (“can you hold? [pause] thank you!” different meaning for a pause of 1 second vs 2 minutes).

Enter Duplex
Google Duplex’s conversations sound natural thanks to advances in understanding, interacting, timing, and speaking.

At the core of Duplex is a recurrent neural network (RNN) designed to cope with these challenges, built using TensorFlow Extended (TFX). To obtain its high precision, we trained Duplex’s RNN on a corpus of anonymized phone conversation data. The network uses the output of Google’s automatic speech recognition (ASR) technology, as well as features from the audio, the history of the conversation, the parameters of the conversation (e.g. the desired service for an appointment, or the current time of day) and more. We trained our understanding model separately for each task, but leveraged the shared corpus across tasks. Finally, we used hyperparameter optimization from TFX to further improve the model.
Incoming sound is processed through an ASR system. This produces text that is analyzed with context data and other inputs to produce a response text that is read aloud through the TTS system.
Duplex handling interruptions:
Duplex elaborating:
Duplex responding to a sync:

Sounding Natural
We use a combination of a concatenative text to speech (TTS) engine and a synthesis TTS engine (using Tacotron and WaveNet) to control intonation depending on the circumstance.

The system also sounds more natural thanks to the incorporation of speech disfluencies (e.g. “hmm”s and “uh”s). These are added when combining widely differing sound units in the concatenative TTS or adding synthetic waits, which allows the system to signal in a natural way that it is still processing. (This is what people often do when they are gathering their thoughts.) In user studies, we found that conversations using these disfluencies sound more familiar and natural.

Also, it’s important for latency to match people’s expectations. For example, after people say something simple, e.g., “hello?”, they expect an instant response, and are more sensitive to latency. When we detect that low latency is required, we use faster, low-confidence models (e.g. speech recognition or endpointing). In extreme cases, we don’t even wait for our RNN, and instead use faster approximations (usually coupled with more hesitant responses, as a person would do if they didn’t fully understand their counterpart). This allows us to have less than 100ms of response latency in these situations. Interestingly, in some situations, we found it was actually helpful to introduce more latency to make the conversation feel more natural — for example, when replying to a really complex sentence.
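The latency trade-off described above can be sketched as a simple dispatch between a fast path and a full path. This is an illustration only, not Duplex's implementation: every name here (`needs_low_latency`, `fast_approximation`, `full_model`, the `SIMPLE_UTTERANCES` set and the canned replies) is hypothetical. Only the idea of answering simple utterances through a quicker, more hesitant path comes from the text.

```python
from dataclasses import dataclass

# Hypothetical illustration; none of these names come from Duplex.
SIMPLE_UTTERANCES = {"hello?", "hi", "yes", "no", "ok"}


@dataclass
class Reply:
    text: str
    path: str  # "fast" or "full"


def needs_low_latency(utterance: str) -> bool:
    """Assumed heuristic: short, common utterances expect an instant reply."""
    return utterance.strip().lower() in SIMPLE_UTTERANCES


def fast_approximation(utterance: str) -> str:
    # Stand-in for a quicker, lower-confidence model; the reply is hesitant.
    return "Um, hi, can you hear me?"


def full_model(utterance: str) -> str:
    # Stand-in for the slower, higher-confidence path (the RNN in the text).
    return f"Understood: {utterance}"


def respond(utterance: str) -> Reply:
    """Route the utterance to the fast path only when low latency matters."""
    if needs_low_latency(utterance):
        return Reply(fast_approximation(utterance), "fast")
    return Reply(full_model(utterance), "full")


print(respond("hello?").path)
print(respond("We are open 11 to 9 on Friday").path)
```

A real system would presumably decide from the audio and conversation state rather than from a fixed word list; the fixed set just keeps the sketch short.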

System Operation
The Google Duplex system is capable of carrying out sophisticated conversations and it completes the majority of its tasks fully autonomously, without human involvement. The system has a self-monitoring capability, which allows it to recognize the tasks it cannot complete autonomously (e.g., scheduling an unusually complex appointment). In these cases, it signals to a human operator, who can complete the task.

To train the system in a new domain, we use real-time supervised training. This is comparable to the training practices of many disciplines, where an instructor supervises a student as they are doing their job, providing guidance as needed, and making sure that the task is performed at the instructor’s level of quality. In the Duplex system, experienced operators act as the instructors. By monitoring the system as it makes phone calls in a new domain, they can affect the behavior of the system in real time as needed. This continues until the system performs at the desired quality level, at which point the supervision stops and the system can make calls autonomously.

Benefits for Businesses and Users
Businesses that rely on appointment bookings supported by Duplex, and are not yet powered by online systems, can benefit from Duplex by allowing customers to book through the Google Assistant without having to change any day-to-day practices or train employees. Using Duplex could also reduce no-shows to appointments by reminding customers about their upcoming appointments in a way that allows easy cancellation or rescheduling.
Duplex calling a restaurant:

In another example, customers often call businesses to inquire about information that is not available online, such as hours of operation during a holiday. Duplex can call the business to inquire about open hours and make the information available online with Google, reducing the number of such calls businesses receive, while at the same time making the information more accessible to everyone. Businesses can operate as they always have; there’s no learning curve or changes to make to benefit from this technology.
Duplex asking for holiday hours:

For users, Google Duplex is making supported tasks easier. Instead of making a phone call, the user simply interacts with the Google Assistant, and the call happens completely in the background without any user involvement.
A user asks the Google Assistant for an appointment, which the Assistant then schedules by having Duplex call the business.
Another benefit for users is that Duplex enables delegated communication with service providers in an asynchronous way, e.g., requesting reservations during off-hours, or with limited connectivity. It can also help address accessibility and language barriers, e.g., allowing hearing-impaired users, or users who don’t speak the local language, to carry out tasks over the phone.

This summer, we’ll start testing the Duplex technology within the Google Assistant, to help users make restaurant reservations, schedule hair salon appointments, and get holiday hours over the phone.
Yaniv Leviathan, Google Duplex lead, and Matan Kalman, engineering manager on the project, enjoying a meal booked through a call from Duplex.
Duplex calling to book the above meal:

Allowing people to interact with technology as naturally as they interact with each other has been a long-standing promise. Google Duplex takes a step in this direction, making interaction with technology via natural conversation a reality in specific scenarios. We hope that these technology advances will ultimately contribute to a meaningful improvement in people’s experience in day-to-day interactions with computers.

knot

1 Share


knot


Mental Health On A Budget

3 Shares

Everyone knows medical care in the US is expensive even with insurance and prohibitively expensive without it. I have a lot of patients who are uninsured, or who bounce on and off insurance, or who have trouble affording their co-pays. This is a collection of tricks I’ve learned (mostly from them) to help deal with these situations. They are US-based and may not apply to other countries. Within the US, they are a combination of legal and probably-legal; I’ve tried to mark which is which but I am not a lawyer and can’t make promises. None of this is medical advice; use at your own risk.

This is intended for people who already know they do not qualify for government assistance. If you’re not sure, check HealthCare.gov and look into the particular patchwork of assistance programs in your state and county.

I. Prescription Medication

This section is about ways to get prescription medication for cheaper. If even after all this your prescription medication is too expensive, please talk to your doctor about whether it can be replaced with a less expensive medication. Often doctors don’t think about this and will be happy to work with you if they know you need it. They may also have other ways to help you save money, like giving you the free sample boxes they get from drug reps.

1. Sites like GoodRx.com. This is first because it’s probably the most important thing most people can do to save money on health care. For example, one month of Abilify 5 mg usually costs $930 at Safeway, but only $30 with a GoodRx coupon. There is no catch. Insurances and pharmacies play a weird game where insurances say they’ll only pay one-tenth the sticker price for drugs, and pharmacies respond by dectupling the price of everything. If you have insurance, it all (mostly) cancels out in the end; if you don’t, you end up paying inflated prices with no relation to reality. GoodRx negotiates discounts so that individual consumers can get drugs for the same discounted price as insurances (or better); they also list the prices at each pharmacy so you know where to shop. This is not only important in and of itself, but its price comparison feature is also important to figure out how best to apply the other features in this category. Even if you have insurance, GoodRx prices are sometimes lower than your copay.

2. Get and split bigger pills. Remember how a month of Abilify 5 mg cost $30 with the coupon? Well, a month of Abilify 30 mg also costs $30. Cut each 30 mg pill into sixths, and now you have six months’ worth of Abilify 5 mg, for a total cost of $5 per month. You’ll need a cooperative doctor willing to prescribe you the higher dose. Note that some pills cannot be divided in this way – cutting XR pills screws up the extended release mechanism. Others like seizure medication are a bad idea to split in case you end up taking slightly different doses each time. Ask your doctor whether this is safe for whatever medication you use. Do not ask the pharma companies or trust their literature – they will always say it’s unsafe, for self-interested reasons. Contrary to some doctors’ concerns, this is not insurance fraud if you’re not buying it with insurance, and AFAIK there’s no such thing as defrauding a pharmacy.

3. Mail order from Canada. Canada has lower prices than the US for various prescription drugs. Canadian pharmacies are unlicensed and illegitimate and you should never use them, according to the same people who tell you that marijuana is a gateway drug and porn will fill your computer with Russian viruses. According to everyone else, including most doctors I know, they are fine as long as you avoid obvious scams. They are technically illegal but the FDA has a policy not to prosecute people who buy drugs there for personal use. The Canadian Internet Pharmacy Association maintains a list of ones they consider safe. If I try really hard, I can find a way to get the month of Abilify 5 mg for $4.58 from canadapharmacy.com, but this isn’t really that much better than the best American option. Some other medications do seem to be better, especially ones that are still on patent; if I want a month of Saphris 10 mg, the best I can find on GoodRx is $620, but on canadapharmacy.com there’s a deal for $196.

4. Pharma company patient assistance programs. As part of their continuing effort to pretend they are anything other than soulless profit-maximizing bloodsuckers who will be first against the wall when the revolution comes, some pharma companies offer their drugs for cheap if you can prove you need them and can’t afford the regular price. These are most useful if for some reason you need a specific expensive brand-name drug; if you have any other options you’re better off just buying the generic. You can search for these programs at Partnership For Prescription Assistance, RXAssist, and NeedyMeds. Be very careful to read the fine print on these, because no matter what they pretend, drug companies are soulless profit-maximizing bloodsuckers who will be first against the wall when the revolution comes, and sometimes these are just small discounts that aren’t as good as using one of the other methods. Occasionally a company will give you a great discount that knocks a brand-name medication costing $300 down to only $150 without telling you that there is a similar generic that costs $5. But if you need one specific very expensive thing, and you are lowish-income, and you don’t have government help, this is still your best bet.

5. Get 90+ day supplies. If your insurance charges you a co-pay of $30 per prescription, and you get a 90-day supply instead of a one month supply, then you’re paying $30 once every three months, instead of once a month.
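The arithmetic behind items 2 and 5 can be checked with a few lines of Python. This is a minimal sketch using only the example figures quoted above ($30 per fill, a 30 mg pill split into 5 mg doses, a $30 co-pay). The function names are made up for illustration, and it assumes a fill costs the same at either strength and that the pill divides evenly into doses.

```python
def split_pill_monthly_cost(fill_price, pill_mg, dose_mg):
    """Monthly cost when one month's fill of larger pills is cut into doses.

    Assumes the fill price is the same for the larger strength and that
    the pill divides evenly into the target dose.
    """
    if pill_mg % dose_mg != 0:
        raise ValueError("pill strength must be a whole multiple of the dose")
    months_covered = pill_mg // dose_mg  # each pill yields this many doses
    return fill_price / months_covered


def annual_copay(copay_per_fill, months_per_fill):
    """Total co-pays in a year when each fill covers `months_per_fill` months."""
    fills_per_year = 12 // months_per_fill
    return copay_per_fill * fills_per_year


# Item 2: a $30 fill of 30 mg pills, cut into 5 mg doses, lasts six months.
print(split_pill_monthly_cost(30, 30, 5))  # 5.0

# Item 5: a $30 co-pay paid monthly versus once per 90-day supply.
print(annual_copay(30, 1))  # 360
print(annual_copay(30, 3))  # 120
```

On these numbers the 90-day supply saves $240 a year in co-pays, and splitting brings the Abilify example from $30 to $5 a month.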

II. Therapy

This section is on ways to do therapy if you cannot afford a traditional therapist. There may also be other options specific to your area, like training clinics attached to colleges that charge “sliding scale” fees (i.e., they will charge you less if you can’t afford full price).

1. Bibliotherapy: If you’re doing a specific therapy for a specific problem (as opposed to just trying to vent or organize your thoughts), studies generally find that doing therapy out of a textbook works just as well as doing it with a real therapist. I usually recommend David Burns’ therapy books: Feeling Good for depression and When Panic Attacks for anxiety. If you have anger, emotional breakdowns, or other borderline-adjacent symptoms, consider a DBT skills workbook. For OCD, Brain Lock.

2. Free support groups: Alcoholics Anonymous is neither as great as the proponents say nor as terrible as the detractors say; for a balanced look, see here. There are countless different spinoffs for non-religious people or people with various demographic characteristics or different drugs. But there are also groups for gambling addiction, sex addiction, and food addiction (including eating disorders). There’s a list of anxiety and depression support groups here. Groups for conditions like social anxiety can be especially helpful since going to the group is itself a form of exposure therapy.

3. Therapy startups: These are companies like BetterHelp and TalkSpace which offer remote therapy for something like $50/week. I was previously more bullish on these; more recently, it looks like they have stopped offering free videochat with a subscription. That means you may be limited to texting your therapist about very specific things you are doing that day, which isn’t really therapy. And some awful thinkpiece sites that always hate everything are also skeptical. I am interested in hearing experiences from anyone who has used these sites. Until then, consider them use-at-your-own-risk.

III. Supplement Analogues

This section is for people who can no longer afford to see a doctor to get their prescription medication. It discusses what supplements are most similar to prescription medications. This is not an endorsement of these substitutions as exactly as good as the medications they are replacing, a recommendation to switch even if you can still get the original medication, or a guarantee that you won’t go into withdrawal if you switch to these. They’re just better than nothing. Make sure to get these from a trusted supplier. I trust this site, but do your own investigation.

This doesn’t include detailed description of doses, side effects, or interactions; you will have to look these up yourself. These are all either legal, or in a gray area of “probably legal” consistent with them being very widely used without punishment. I am not including illegal options, even though some of them are clearly stronger than these – but you can probably find them if you search.

1. Similar to SSRIs: 5-HTP. This is a serotonin precursor that can serve some of the same roles that selective serotonin reuptake inhibitors do, though this is still controversial and it is probably not as strong. Cochrane Review thinks that “evidence does suggest these substances are better than placebo at alleviating depression”. This may plausibly help with SSRI withdrawal, though not as much as going back on an SSRI. It can be dangerous if you are taking any other serotonergic medication, so check with the doctor prescribing it first. Cost is about $10/month. Definitely legal.

2. Similar to antidepressants in general: Tianeptine. This is a European antidepressant which is unregulated in the US, making it the only way I know to get a regulatory-agency-approved antidepressant without a prescription. Look up the difference between the sodium and the sulfate versions before you buy. Generally safe at the standard dose; higher doses carry a risk of addiction. Cost is about $20/month. Probably legal, widely used without legal challenges.

3. Similar to stimulants: Adrafinil. This is the prodrug of modafinil, a stimulant-ish medication widely used off-label for ADHD. Modafinil itself is Schedule IV controlled (though widely available online); adrafinil is unscheduled and also widely available. Look up the debate over liver safety before you use. Cost is about $30/month. Probably legal, widely used without legal challenges.

4. Similar to anxiety medications: GABA and picamilon. GABA is an endogenous inhibitory neurotransmitter, but it has questionable ability to cross the blood-brain barrier when taken orally (though see here for counterargument). Picamilon is the same neurotransmitter attached to a niacin molecule that helps it cross the BBB more readily. Both are sold as supplements. The evidence base is weak, and this is the entry on this list I am most skeptical of. Use at your own risk (of it not working; it’s probably pretty safe). Neither of these is as strong as a benzodiazepine and these will not significantly relieve acute benzodiazepine withdrawal. Cost is about $30/month. GABA is definitely legal. Picamilon is possibly legal; the FDA has tried to stop companies from selling it as a dietary supplement, but does not seem to be challenging users.

You can find a discussion of other supplements for depression at Part IV here and for anxiety at Part IV here. You can find a discussion of ways that supplements can play a very minor role in helping with psychosis in the second to last paragraph of Part 12 here, but please don’t rely on this. I no longer 100% endorse everything in those lists.

If you know other safe and legal ways to save money on psychiatric care, please mention them in the comments and I’ll add them as they come up.


Expressive Speech Synthesis with Tacotron

1 Share


At Google, we're excited about the recent rapid progress of neural network-based text-to-speech (TTS) research. In particular, end-to-end architectures, such as the Tacotron systems we announced last year, can both simplify voice building pipelines and produce natural-sounding speech. This will help us build better human-computer interfaces, like conversational assistants, audiobook narration, news readers, or voice design software. To deliver a truly human-like voice, however, a TTS system must learn to model prosody, the collection of expressive factors of speech, such as intonation, stress, and rhythm. Most current end-to-end systems, including Tacotron, don't explicitly model prosody, meaning they can't control exactly how the generated speech should sound. This may lead to monotonous-sounding speech, even when models are trained on very expressive datasets like audiobooks, which often contain character voices with significant variation. Today, we are excited to share two new papers that address these problems.

Our first paper, “Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron”, introduces the concept of a prosody embedding. We augment the Tacotron architecture with an additional prosody encoder that computes a low-dimensional embedding from a clip of human speech (the reference audio).
We augment Tacotron with a prosody encoder. The lower half of the image is the original Tacotron sequence-to-sequence model. For technical details, please refer to the paper.
This embedding captures characteristics of the audio that are independent of phonetic information and idiosyncratic speaker traits — these are attributes like stress, intonation, and timing. At inference time, we can use this embedding to perform prosody transfer, generating speech in the voice of a completely different speaker, but exhibiting the prosody of the reference.

Text: *Is* that Utah travel agency?
Reference prosody (Australian)
Synthesized without prosody embedding (American)
Synthesized with prosody embedding (American)

The embedding can also transfer fine time-aligned prosody from one phrase to a slightly different phrase, though this technique works best when the reference and target phrases are similar in length and structure.

Reference Text: For the first time in her life she had been danced tired.
Synthesized Text: For the last time in his life he had been handily embarrassed.
Reference prosody (American)
Synthesized without prosody embedding (American)
Synthesized with prosody embedding (American)

Excitingly, we observe prosody transfer even when the reference audio comes from a speaker whose voice is not in Tacotron's training data.

Text: I've Swallowed a Pollywog.
Reference prosody (Unseen American Speaker)
Synthesized without prosody embedding (British)
Synthesized with prosody embedding (British)

This is a promising result, as it paves the way for voice interaction designers to use their own voice to customize speech synthesis. You can listen to the full set of audio demos for “Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron” on this web page.

Despite their ability to transfer prosody with high fidelity, the embeddings from the paper above don't completely disentangle prosody from the content of a reference audio clip. (This explains why they transfer prosody best to phrases of similar structure and length.) Furthermore, they require a clip of reference audio at inference time. A natural question then arises: can we develop a model of expressive speech that alleviates these problems?

In our second paper, “Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis”, we do just that. Building upon the architecture in our first paper, we propose a new unsupervised method for modeling latent "factors" of speech. The key to this model is that, rather than learning fine time-aligned prosodic elements, it learns higher-level speaking style patterns that can be transferred across arbitrarily different phrases.

The model works by adding an extra attention mechanism to Tacotron, forcing it to represent the prosody embedding of any speech clip as the linear combination of a fixed set of basis embeddings. We call these embeddings Global Style Tokens (GSTs), and find that they learn text-independent variations in a speaker's style (soft, high-pitch, intense, etc.), without the need for explicit style labels.
Model architecture of Global Style Tokens. The prosody embedding is decomposed into “style tokens” to enable unsupervised style control and transfer. For technical details, please refer to the paper.
At inference time, we can select or modify the combination weights for the tokens, allowing us to force Tacotron to use a specific speaking style without needing a reference audio clip. Using GSTs, for example, we can make different sentences of varying lengths sound more "lively", "angry", "lamenting", etc:

Text: United Airlines five six three from Los Angeles to New Orleans has Landed.
Style 1
Style 2
Style 3
Style 4
Style 5
The text-independent nature of GSTs makes them ideal for style transfer, which takes a reference audio clip spoken in a specific style and transfers its style to any target phrase we choose. To achieve this, we first run inference to predict the GST combination weights for an utterance whose style we want to imitate. We can then feed those combination weights to the model to synthesize completely different phrases — even those with very different lengths and structure — in the same style.
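The idea of representing a style as combination weights over a fixed bank of tokens can be illustrated with a toy example. This is a sketch under stated assumptions, not the model from the paper: the token values, the attention scores and the function names are made up, and the softmax normalization of the weights is an assumption about how the attention output is turned into weights.

```python
import math


def softmax(scores):
    """Turn raw attention scores into weights that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def style_embedding(weights, tokens):
    """Linear combination of the token embeddings with the given weights."""
    dim = len(tokens[0])
    return [sum(w * tok[d] for w, tok in zip(weights, tokens)) for d in range(dim)]


# Toy bank of three style tokens with 2-dimensional embeddings (made-up values).
TOKENS = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Style transfer: weights are predicted from a reference clip (here, made-up
# attention scores) and can then condition synthesis of any other phrase.
reference_weights = softmax([2.0, 0.5, -1.0])
transferred = style_embedding(reference_weights, TOKENS)

# Direct control: pick the weights by hand, with no reference audio at all.
single_token = style_embedding([0.0, 1.0, 0.0], TOKENS)
print(single_token)  # [0.0, 1.0]
```

Because the weights do not depend on the text being spoken, the same `reference_weights` could be reused for phrases of any length, which is the property the paragraph above relies on.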

Finally, our paper shows that Global Style Tokens can model more than just speaking style. When trained on noisy YouTube audio from unlabeled speakers, a GST-enabled Tacotron learns to represent noise sources and distinct speakers as separate tokens. This means that by selecting the GSTs we use in inference, we can synthesize speech free of background noise, or speech in the voice of a specific unlabeled speaker from the dataset. This exciting result provides a path towards highly scalable but robust speech synthesis. You can listen to the full set of demos for "Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis" on this web page.

We are excited about the potential applications and opportunities that these two bodies of research enable. In the meantime, there are new important research problems to be addressed. We'd like to extend the techniques of the first paper to support prosody transfer in the natural pitch range of the target speaker. We'd also like to develop techniques to select appropriate prosody or speaking style automatically from context, using, for example, the integration of natural language understanding with TTS. Finally, while our first paper proposes an initial set of objective and subjective metrics for prosody transfer, we'd like to develop these further to help establish generally-accepted methods for prosodic evaluation.

Acknowledgements
These projects were done jointly between multiple Google teams. Contributors include RJ Skerry-Ryan, Yuxuan Wang, Daisy Stanton, Eric Battenberg, Ying Xiao, Joel Shor, Rif A. Saurous, Yu Zhang, Ron J. Weiss, Rob Clark, Fei Ren and Ye Jia.

