• 1 Post
  • 46 Comments
Joined 2 years ago
Cake day: June 4th, 2023

  • These books were purchased by them before being destroyed in the scanning process. I fail to see the issue with this specific case. Lots of artists buy stuff and irreversibly modify it. Are we going to be angry now at people who glue their puzzles or use parts of books for scrapbooking? If these were unique works there would be an issue, but I don’t think that truly unique pieces would be in their target group, as the destructive scanning is all about cost cutting and unique works cost a lot of money that they wouldn’t just destroy.

    The fact that they use it for model training and later sell access to that model’s work is the shady part that has a severe whiff of plagiarism to it.



  • If I were to ask my Magic 8 Ball “Is the word ‘difinitely’ misspelled?” 100 times, it’s going to reply in the affirmative over 16% of the time.

    This comparison makes no sense. Your example has a binary question. In that case, any system that replies correctly at even a rate of around 50% would be useless. However, the problem space in this scenario is way larger than 2 options and still way larger than 100 options. Being correct in even a small number of 100 attempts is still statistically significant.

    The fact that an LLM is unable to reason and that it is based on statistics doesn’t change anything about this behavior. At the end of the day you get a tool that is able to point you to actual new information that you by yourself did not arrive at.

    Imagine that you put a lot of effort into a better model specifically for vulnerability research and you get it up to a correctness rate of a mere 10%. I would gladly hire some programmers to sift through these reports and possibly find overlooked vulnerabilities.
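To put rough numbers on the argument above, here is a minimal sketch. All figures are assumptions for illustration: 100 runs, a hypothetical 8 correct answers, and a problem space of only 100 equally likely candidate answers (a conservative stand-in, since the real space is argued to be far larger). It compares how likely blind guessing is to reach that many hits on a yes/no question versus in the larger space, and how many reports a reviewer would read per real finding at a 10% correctness rate.

```python
from math import comb


def binomial_tail(n: int, k: int, p: float) -> float:
    """Probability of at least k successes in n independent trials,
    each succeeding with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def expected_reviews_per_finding(correct_rate: float) -> float:
    """Average number of reports to read before hitting one correct report."""
    return 1 / correct_rate


RUNS = 100  # assumed number of attempts
HITS = 8    # hypothetical number of correct answers

# Blind guessing on a binary question (like the Magic 8 Ball example).
binary_guess = binomial_tail(RUNS, HITS, 0.5)

# Blind guessing among 100 equally likely candidates (assumed, conservative).
large_space_guess = binomial_tail(RUNS, HITS, 0.01)

print(f"P(>= {HITS} hits by chance, binary question): {binary_guess:.6f}")
print(f"P(>= {HITS} hits by chance, 100 candidates):  {large_space_guess:.2e}")
print(f"Reports per real finding at 10%: {expected_reviews_per_finding(0.10):.0f}")
```

Under these assumptions, a handful of hits says nothing on a binary question, but it would be very unlikely to come from blind guessing once the answer space is large.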


  • This is literally the very first experiment in this use case, done by a single person on a model that wasn’t specifically designed for this. The fact that it is able to formulate a correct response at all in this situation impresses me.

    It would be easy to criticize this if it were the endpoint and this was being advertised as a tool for vulnerability research, but as discussed at the end of the post, this “quick little test” shows both initial promising results and had the fortunate byproduct of actually revealing a new vulnerability. By no means is it implied that it is now ready for use in this field.

    The issue with hallucinations is one that, in my opinion, is never going to be totally fixed. That is why I hate the use of AI as a final arbiter of truth, which is sadly how a lot of people use it (“I’ll quickly ask ChatGPT”) and how companies advertise it. What it is good at, however, is coming up with plausible ideas. In this case, having an indication of things to check in the code can be a great tool for discovering new stuff, as is literally the case for this security researcher, who found a new vulnerability after auditing the module themselves.


  • I hate AI. Why?

    • Because of its extreme energy consumption compared to what it achieves
    • Because it is all in the hands of the worst companies on this planet
    • Because capitalists are foaming at the mouth to use it to fuck over workers
    • Because it is devaluing art and reducing it to another commodity to “produce”

    However

    I also took the time to read the original blog post, and it is a fascinating story.

    The author starts out by using an existing vulnerability as a benchmark for testing ChatGPT. They describe how they took the code specific to the vulnerability and packaged it for ChatGPT, how they formatted the query and what their results were. In 100 runs, only 8 correctly identified the targeted vulnerability; the rest were false positives or claimed that there were no vulnerabilities in the given code.

    Then they take their test a step further and increase the amount of code shared with ChatGPT so that it also includes parts of the module that had nothing to do with the original vulnerability. As expected, this larger input decreases performance and also reduces the detection rate for the targeted vulnerability. However, in those 100 runs, another vulnerability was described that wasn’t a false positive. An actual new vulnerability that the author didn’t know about was discovered. Again, the signal-to-noise ratio is very low, and one has to sift through a lot of wrong reports to get a realistic one, but this proved that it could be a useful tool for helping to detect vulnerabilities.

    I highly recommend reading the blog post.

    As much as I like to be critical about AI, it doesn’t help if we put our heads in the sand and act as if it never does something cool.








  • While that is technically correct, they do have it in China itself: a modified version called Douyin. It is more restricted, censored and tightly controlled.

    I agree that it is a cyberweapon, but I don’t think that it’s only used against foreigners. They use it just as much to observe and influence their own population.

    Finally, I would like to point out that, to a lesser extent, this is also the case for a lot of USA-owned social media and tech companies. Edward Snowden’s revelations, for example, indicate this. While the extent of government control and influence is much larger in China, I wouldn’t underestimate the influence of Meta, Google and Microsoft, for example.




  • That’s a very interesting point of view, and indeed well formulated in the video!

    I don’t necessarily agree with it though. I as a human being have grown up and learned from experience and the experiences of previous humans that were documented or directly communicated to me. I can see no inherent difference with an artificial intelligence learning on the same data.

    I never did all the experiments or the research that previous scientists did, but I trust their reproducibility and logical conclusions. In the same way, I think artificial intelligence could theoretically also learn these things based on previously documented findings. This would be an ideal “general intelligence” AI.

    The main problem, I think, is that AI needs to be even more computationally intensive and complex for it to be able to get to these advanced levels of understanding. And at this point, I see it as a fun theoretical exercise without actual practical benefit: the cost (in money, time and energy) seems far too large to eventually create something that we can already do as humans ourselves.

    The current state of LLMs is one of very basic “semblance” of understanding, and close to what you describe as probability based conversation.

    I feel that AI is best at doing very specific tasks, where the problem space is small enough for it to actually learn the underlying model. In the same way, I think that LLMs are best at language: rewriting text or generating stuff. What companies seem to think, though, is that because a model is good at producing realistic language, it is also competent at the contents of what it is writing. And again, for that to be true, it needs a much more advanced method of calculation than is currently available.

    Take this all with a grain of salt though, as I am no expert on the matter. I am an electrical engineer who no longer works in the sector due to mental issues, but with an interest in computer science.



  • While I understand where you’re coming from, I believe that it distracts from a massive positive effect that the GPL has: the way it ensures collaboration. Lots of contributors to GPL software do so in the knowledge that they are working on something great together. I myself have felt discouraged from contributing to MIT-licensed software, because I know that others might just take all the hard work, make something proprietary of it and give nothing back.

    I see the GPL as some sort of public transaction. It is indeed more limiting than MIT and offers less pure freedom in that sense. But I just love how it uses copyright not to enforce licensing payments to some private entity, but to enforce a contribution to the community as a whole. I find this quite beautiful.



  • Thank you for taking the time to respond. With siphoning money, I mean not giving actual value in return. The NFT market was a clear example of this: get some hype going, sell the promise of great gains on your investment, once the ball gets rolling make sure you’re out before they realise it’s actually worth nothing. In the end, some smart and cunning people sucked a lot of money from often poor and misinformed small investors.

    I think I have an inherent idea of value, as in: the value it has in a human life and the amount of effort needed to produce it. This has become very detached from economic value, as there you can have speculation, pumping value and all that other crap. I think that’s what frustrates me about the current financial climate: I just want to be able to pay the people who helped produce the product I buy fairly with respect to how much time and work they put in. Currently however, so much money is being transferred to people “just for having money”. The idea that money in and of itself can make more money is such a horrible perversion of the original idea of trade…


  • Your last paragraph is not how money should work at all. Money should represent value that ideally doesn’t change, so that the money I receive for selling a can is worth a can, not a Lambo and not a grain of sand. What you’re describing is closer to speculation and pyramid schemes (NFTs for example).

    Either try to explain to me how BTC could be an ideal currency that fixes the problems in existing currency, or try to explain to me how it’s really cool as an investment thing to siphon money from others, but don’t try to do both at the same time.