Transcript: One thing that I'm interested in building, given that I have a finger injury right now, and it's on my right hand, which is my dominant hand, is a better input methodology for single-hand use of a computer, specifically geared towards programming and the things that I do on a computer most often. Keyboard shortcuts fortunately get me around a lot, and I can rely on them to a degree with my left hand, but using anything on the right side of the keyboard is a pain. Basically only my thumb is useful at the moment, so that's not great. But it also does simulate me having a stub in a way, just that a thumb is a bit better than a stub, in terms of its smallness. But it is making me wonder about building a hardware device where it effectively has mechanical keyboard switches and a 3D printed keycap that is huge, where there's maybe one to three buttons. One button would just be to stop and start recording audio. If that is not good enough in terms of being able to decipher in real time, well, doing the speech-to-text, and then deciding, well, is this an input going directly into a text field, or is this a command or something like that, specifically like Command-P? And that would need to be very quickly separated from me searching the name of a file. So, yeah. Makes me wonder also how Whisper generally would work for programming tasks, like "bun install" something. You know? I suspect that Whisper is not going to be very good at this, on average. And, well, I guess to some degree it doesn't matter. But if we also did have, like, screen reading functionality, it would be useful to capture some of that information and kind of decipher what I'm thinking. But this is getting a little bit too far. But, yeah, I mean, I think running, like, a local Whisper is probably doable, and then having it real time is also doable. And getting it to do actions on the computer, I am slightly less sure about.
I wonder if just doing it in, like, Python, using, like, MLX is the easiest. I don't know. But that might be an interesting way to explore, because then I don't have to, like, set up, you know, like a WebSocket server or anything. I can kind of just do, like, real-time decoding using MLX, getting the results, potentially feeding them to a local large language model to do, like, function calling support effectively. Like, is this a command or is this text that needs to go into an input field? But some of that also can potentially be done directly in the Whisper decoding part. I think keyword detection is a thing that people do. So keyword detection, and determining when to end it, would be important. But, yeah, thinking about this as an input methodology, this probably could be useful for other people. And it also just seems kind of like using a computer with two hands, but doing different things: one hand has had its task simplified, like it doesn't actually need to type very much anymore, and then it's like, well, what are you doing with the other hand, and how are you using that to manipulate the computer? That is kind of just an interesting train of thought and thought experiment, generally speaking, and would most likely be something good to publish a video on. I also do need to keep track of projects, and I probably should start keeping track of the zine project. So, yeah, I'm going to stop this here and start talking about the zine project.
Injured and limited mostly to their non-dominant hand, the author contemplates developing a single-hand input device for programming and other computer tasks. The idea involves mechanical keyboard switches under a large 3D-printed keycap, with one to three buttons for functions like starting and stopping audio recording, and speech-to-text for both text input and commands. The author considers the challenges of real-time speech recognition, keyword detection, and integrating the system with a local large language model that uses function calling to decide whether an utterance is a command or text for an input field. They reflect on the potential utility of this device for other people and the feasibility of implementing it in Python with MLX and a local Whisper model. The author also contemplates using both hands differently on a computer, where one hand performs a simplified task and the other is used for more complex manipulation, and sees this as an interesting topic for a video. They also mention the need to track their projects, specifically the zine project, which they plan to discuss next.
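The "is this a command or is this text for an input field?" decision from the transcript can be sketched without any speech model at all, working only on text that has already been transcribed. What follows is a minimal sketch under stated assumptions, not the approach the author settled on: the spoken modifier vocabulary, the Action type, and the route_utterance function are all hypothetical names introduced here, and the rule (one or more modifier words followed by exactly one single-character key means a shortcut) is just one simple way to do keyword detection. A local language model with function calling, as floated in the transcript, would replace this rule with something more flexible.

```python
from dataclasses import dataclass

# Hypothetical vocabulary: spoken modifier words mapped to short key names.
MODIFIERS = {"command": "cmd", "control": "ctrl", "option": "alt", "shift": "shift"}


@dataclass(frozen=True)
class Action:
    kind: str       # "hotkey" or "text"
    payload: tuple  # key names for a hotkey, or a 1-tuple holding the dictated text


def route_utterance(transcript: str) -> Action:
    """Classify one already-transcribed utterance as a hotkey or as dictated text."""
    cleaned = transcript.strip()
    # Normalize case and drop trailing punctuation that a speech model may append.
    words = cleaned.lower().rstrip(".!?").split()

    # Collect leading modifier words ("command", "shift", ...).
    keys = []
    i = 0
    while i < len(words) and words[i] in MODIFIERS:
        keys.append(MODIFIERS[words[i]])
        i += 1
    rest = words[i:]

    # Treat the utterance as a shortcut only when at least one modifier is
    # followed by exactly one single-character key, e.g. "command p".
    if keys and len(rest) == 1 and len(rest[0]) == 1:
        return Action("hotkey", tuple(keys + [rest[0]]))

    # Everything else is passed through unchanged as text to type.
    return Action("text", (cleaned,))


if __name__ == "__main__":
    for spoken in ["Command P.", "bun install something", "command line tools"]:
        print(spoken, "->", route_utterance(spoken))
```

With this rule, "Command P." is routed as a shortcut while "command line tools" and "bun install something" stay as dictated text, which is the separation the transcript asks for between Command-P and searching for a file name. Actually pressing the keys or typing the text would need an OS automation layer, which is deliberately left out here.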
Future improvements for the system include adding a foot pedal as a USB input device to reduce reliance on the right hand, which is currently unergonomic. The user finds it challenging to perform logical thinking and code-related tasks without using the right hand, possibly due to underutilization of a part of the brain. Other possibilities involve making the system more agential, such as allowing commands to be sent, observed, and modified in real-time. Visual interactions, like popping up and closing windows based on gaze detection, are also intriguing, especially if using Moondream for gaze tracking. The user is eager to test these enhancements with both hands.
The user discusses their experience with Handy, a custom-built external keyboard and program, which helps them use the computer despite a broken finger on their right hand. While Handy has been useful, particularly with typing and voice commands in VS Code, the user faces challenges from not being able to write with their right hand and experiencing bodily discomfort from overuse of their left hand. These physical issues are not addressed by the program.
The individual expresses amazement at their ability to accomplish a lot of work, noting that the primary limitation is their brain rather than the technology at their disposal. Despite the aid of AI, which allows them to perform their usual tasks, they feel mentally constrained. They often desire to write but find it challenging to do so with their left hand, considering it as a potential method to make progress.
82.03% similar
The speaker is reflecting on their experience with making audio burrito posts, noting that it often requires multiple attempts to get into the correct mindset—similar to drafting written posts. They're grappling with the challenge of monologuing without a clear understanding of the audience, as they are aware that at least John and CJ will hear it, but uncertainty about the wider audience affects their ability to communicate effectively. This creates a 'contextual membrane shakiness' as the speaker finds the lack of audience boundaries difficult to navigate, which they recognize may vary among different people. The speaker concludes by deciding to end the current note and start a new one.
81.16% similar
The author contemplates the process of converting an audio note into a transcript, then summarizing it on their "burrito" page. They express a desire to adjust the summarization voice to better represent themselves on the page. Recognizing that this feature may not have widespread appeal, the author nonetheless sees value in providing users with controls to personalize their "burrito." The concept of allowing users to fine-tune their experience is seen as an intriguing possibility.
80.14% similar
I've always been drawn to the peculiar and unexplored, which makes me wonder if I can pepper my writing with a bit of the offbeat—things that don't quite fit the mold. Question is, can I make it work? Ditching the third-person narrative and opting for a chat with you in the first person could make my stories feel more intimate, more like we're in this together. And hey, isn't that what storytelling's all about? Let's find out.
79.36% similar
The speaker is discussing the principles of social design in the context of creating engaging digital spaces, drawing on the collaborative work with Kristen. They emphasize the importance of social participation, challenges, and focused attention in driving user engagement within a product. Kristen's expertise in designing environments for coherence, sense-making, and collaboration is highlighted, particularly in the transition to digital spaces. The speaker believes that fundamental design elements, like those in a burrito, are critical for crafting unique and compelling user experiences in social design.
79.15% similar